Abstract

This paper studies a method to estimate the parameters governing the distribution of a stationary marked Gibbs point process. This procedure, known as the Takacs-Fiksel method, is based on the estimation of the left and right hand sides of the Georgii-Nguyen-Zessin formula and leads to a family of estimators due to the possible choices of test functions. We propose several examples illustrating the interest and flexibility of this procedure. We also provide sufficient conditions based on the model and the test functions to derive asymptotic properties (consistency and asymptotic normality) of the resulting estimator. The different assumptions are discussed for exponential family models and for a large class of test functions. A short simulation study is proposed to assess the correctness of the methodology and the asymptotic results.
1 Introduction

Spatial point pattern data arise in a wide range of applications and, since the seventies [10], [31], various
statistical methods have been developed to study these kinds of data (see [3B], [37] or [37] for recent
reviews). In particular, a spatial point pattern is often modelled as the realization of a Gibbs distribution,
defined through an interaction function, also called Hamiltonian. Gibbs models are extensions of the
well-known Poisson process since they constitute a way to introduce dependence between points.
Inference for parametric models in this setting has developed considerably during the last decade. The
most popular method to estimate the parameters is certainly the maximum likelihood estimator (MLE).
It involves an intractable normalizing constant, but recent developments in computational statistics, in
particular perfect simulation, have made inference feasible for many Gibbs models (see [3]). Although
the MLE suffers from a lack of theoretical justification (only some results for sparse patterns are
proposed in [33]), some comparison studies, as in [18], have shown that it outperforms the other estimation
methods. Nevertheless, the computation of the MLE remains very time-consuming and even extremely
difficult to perform for some models. It is thus necessary to have quick alternative estimators at one's
disposal, at least to propose relevant initial values for the MLE computation. The maximum
pseudo-likelihood estimator (MPLE for short) constitutes one of them. Proposed by Besag in [7] and popularized
by J.L. Jensen and J. Møller in [29] and A. Baddeley and R. Turner in [TJ for spatial point processes,
this method has the advantage of being theoretically well understood (see [9], [16]) and of being much
faster to compute than the MLE. Another estimation procedure is the Takacs-Fiksel estimator, which
arose from [19], [45], [20]. It can be viewed, in some sense, as a generalization of the MPLE. As a matter
of fact, the Takacs-Fiksel method is not very popular, nor really used in practice. The main reason is
certainly its relatively poor performance, in terms of mean square error, observed for some particular
cases as in [18]. However, we think that this procedure deserves some consideration for several reasons
that we expose below.

The Takacs-Fiksel procedure is based on the Georgii-Nguyen-Zessin formula (GNZ formula for short). 
Empirical counterparts of the left and the right hand sides of this equation are considered, and the induced 
estimator is such that the difference of these two terms is close to zero. Since the GNZ formula is valid 
for any test functions, the Takacs-Fiksel procedure does not only lead to one particular estimator but to a 
family of estimators, depending on the choice of the test functions. This flexibility is the main advantage 
of the procedure. We present several examples in the present paper. Let us summarize them in order to 
underline the interest of this method (see Section 3.2 for more details). First, this procedure can allow us
to achieve estimations that likelihood-type methods cannot. As an example, we focus on the quermass
model, which includes the area interaction point process as a particular case (see [3D], [H]). This model
is sometimes used for geometric random objects. From a data set, one typically does not observe the 
point pattern but only some geometric sets arising from these points. The non-observability of the points 
makes the likelihood-type inference unfeasible. As we will show, this problem may be solved thanks to 
the Takacs-Fiksel procedure, provided that the test functions are chosen properly. 

Another motivation is the possibility to choose test functions depending on the Hamiltonian in order 
to construct quicker estimators which do not require the computation of an integral for each value of 
the parameter. This improvement appears crucial for rigid models, such as those involved in stochastic 
geometry (see [17]), for which the MLE is prohibitively time-consuming and the MPLE still remains 
difficult to implement. Moreover, for some models, it is even possible to obtain explicit estimators which 
do not require any simulation nor optimization. This is illustrated for the Strauss model. 

Therefore, it appears important to us to understand the theoretical properties of this procedure. This 
problem is the main objective of the present paper. We prove the consistency and the asymptotic nor- 
mality of the induced estimator in a very general setting. In particular, we obtain a central limit theorem 
for the Takacs-Fiksel estimator with the classical rate of convergence, i.e. square root of the volume of 
the observation domain. This asymptotic result leads to the following comment: as a quick consistent 
estimator, the Takacs-Fiksel estimator appears to be a very good starting point for refined algorithms. 
Among them, let us mention the one-step Newton-Raphson algorithm (as used in [5SJ) which allows an
accurate approximation of the MLE, starting from a consistent estimator, in only one step. Although the
theoretical justifications are missing in the Gibbs framework, it is well-known that this procedure leads to
an efficient estimator in the classical iid case (see [32]). Another possibility could be to exploit the local
asymptotic normality of the model in order to construct an adaptive estimator from the Takacs-Fiksel 
estimator. The Local Asymptotic Normality property (LAN) has only been proved for restrictive models 
in [33], but one can hope that it remains true for most Gibbs models. This procedure also leads to an 
efficient estimator. All these possibilities are interesting prospects for future investigations. 

Some asymptotic properties of the Takacs-Fiksel procedure have already been investigated in two
previous studies: one by L. Heinrich in [25] and one by J.-M. Billiot in [5]. These papers have
different frameworks and are based on different tools, but they both involve regularity and integrability
type assumptions on the Hamiltonian and a theoretical condition which ensures that the contrast function
(associated to the Takacs-Fiksel procedure) has a unique minimum. In [25], the consistency and the
asymptotic normality are obtained for a quite large class of test functions. These results are, however,
proved under the Dobrushin condition (see Theorems 2 and 3 in [25]) which implies the uniqueness of the
underlying Gibbs measure and some mixing properties. This condition imposes a dramatic reduction of
the space of possible values for the parameters of the model. In [8], the author focuses only on pairwise
interaction point processes (which excludes the quermass model for example). The author mainly obtained
the consistency for a specific class of test functions. In the case of a multi-Strauss pairwise interaction
point process, the author also proved that the identifiability condition holds for the class of test functions
he considered.



In contrast, our asymptotic results are proved in a very general setting, i.e. for a large class of
stationary marked Gibbs models and test functions. The method employed to prove asymptotic normality
is based on a conditional centering assumption, which first appeared in [24] for the Ising model and was generalized
to certain spatial point processes in [28]. The only main restriction that this method induces is the finite
range of the Hamiltonian. There are no limitations on the space of parameters and, in particular, the
possible presence of phase transition does not affect the asymptotic behavior of the estimator. Moreover,
the test functions may depend on the parameters. This extension seems important to us because, as
emphasized in Section 3.2.2, such test functions can lead to quick and/or explicit estimators. All the
general hypotheses assumed for the asymptotic results are discussed. For this, we focus on exponential
family models, that is, on models whose interaction function is linear in the parameters. We show that
our integrability and regularity assumptions are not restrictive since they are valid for a large class of
models such as the multi-Strauss marked point process, the Strauss-disc type point process, Geyer's
triplet point process, the quermass model and for all test functions used as a motivation for this work. In
the setting of the exponential family models, we also discuss the classical identifiability condition which
is required for the Takacs-Fiksel procedure. To the best of our knowledge, this is the first attempt to
discuss it. We will especially dwell on questions like: what choices of test functions (and how many test
functions) lead to a unique minimum of the contrast function? We propose general criteria and provide
examples. It seems commonly admitted that to achieve the identification of the Takacs-Fiksel procedure,
one should choose at least as many test functions as the number of parameters. As a consequence of
our study, it appears that one should generally choose strictly more test functions than the number of
parameters to achieve identification.

The rest of the paper is organized as follows. Section 2 introduces notation and a short background
on marked Gibbs point processes. The Takacs-Fiksel method is presented in Section 3. It is based on
the GNZ formula, which is also recalled in Section 3. Several examples of test functions are given. They
aim at illustrating our interest in considering the Takacs-Fiksel procedure. The asymptotic results for the
induced estimator are given in Section 4. Our results are obtained from a single realization observed in
a domain whose volume is supposed to increase to infinity. Some integrability and regularity assumptions
made for the Hamiltonian and for the test functions are discussed in this section, while in Section 5 the
identifiability condition is specifically dealt with. In Section 6, the very special situation where the energy
function is not hereditary is considered. The GNZ formula is no longer valid in this setting, but it has
been recently extended in [TB] thanks to a slight modification. This leads to a natural generalization of
the Takacs-Fiksel procedure. In Section 7, we propose a brief simulation study in order to assess the
correctness of the asymptotic results we obtained in Section 4. Finally, Section 8 contains the proofs of
the asymptotic results.

2 Background and notation 

2.1 General notation, configuration space 

Subregions of $\mathbb{R}^d$ will typically be denoted by $\Lambda$ or $\Delta$ and will always be assumed to be Borel with
positive Lebesgue measure. We write $\Lambda \Subset \mathbb{R}^d$ if $\Lambda$ is bounded. $\Lambda^c$ denotes the complementary set of $\Lambda$
inside $\mathbb{R}^d$. The notation $|\cdot|$ will be used without ambiguity for different kinds of objects. For a countable
set $\mathcal{J}$, $|\mathcal{J}|$ represents the number of elements belonging to $\mathcal{J}$; for $\Lambda \Subset \mathbb{R}^d$, $|\Lambda|$ is the volume of $\Lambda$; for
$x \in \mathbb{R}^d$, $|x|$ corresponds to its uniform norm while $\|x\|$ is its Euclidean norm. For all $x \in \mathbb{R}^d$, $\rho > 0$, let
$B(x,\rho) := \{y \in \mathbb{R}^d, \|y - x\| < \rho\}$. For a matrix $M$, let $\|M\|$ be the Frobenius norm of $M$ defined by
$\|M\|^2 = \mathrm{Tr}(M^T M)$, where $\mathrm{Tr}$ is the trace operator.

The space $\mathbb{R}^d$ is endowed with the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^d)$ and the Lebesgue measure $\lambda$. Let $\mathbb{M}$ be a
measurable space, which aims at being the mark space, endowed with the $\sigma$-algebra $\mathcal{M}$ and the probability
measure $\lambda^m$. The state space of the point processes will be $\mathbb{S} := \mathbb{R}^d \times \mathbb{M}$ measured by $\mu := \lambda \otimes \lambda^m$. We
shall denote for short $x^m = (x, m)$ an element of $\mathbb{S}$.

A configuration is a subset $\varphi$ of $\mathbb{S}$ which is locally finite in that $\varphi_\Lambda := \varphi \cap (\Lambda \times \mathbb{M})$ has finite cardinality
$N_\Lambda(\varphi) := |\varphi_\Lambda|$ for all $\Lambda \Subset \mathbb{R}^d$. The space $\Omega = \Omega(\mathbb{S})$ of all configurations is equipped with the $\sigma$-algebra
$\mathcal{F}$ that is generated by the counting variables $\varphi \mapsto |\varphi \cap (\Lambda \times A)|$ for any $\Lambda \Subset \mathbb{R}^d$ and any $A \in \mathcal{M}$. Finally,
let $T = (\tau_x)_{x \in \mathbb{R}^d}$ be the shift group, where $\tau_x : \Omega \to \Omega$ is the translation by the vector $-x \in \mathbb{R}^d$ (i.e.
the application $\varphi \mapsto \bigcup_{(y,m) \in \varphi} \{(y - x, m)\}$). For the sake of simplicity, we set $\varphi \cup x^m := \varphi \cup \{x^m\}$ and
$\varphi \setminus x^m := \varphi \setminus \{x^m\}$.

2.2 Marked Gibbs point processes 

Our results will be expressed for general stationary Gibbs point processes. Since we are interested in
asymptotic properties, we have to consider these point processes acting on the infinite volume $\mathbb{R}^d$. Let
us briefly recall their definition.

A marked point process $\Phi$ is an $\Omega$-valued random variable, with probability distribution $P$ on $(\Omega, \mathcal{F})$.
The most prominent marked point process is the marked Poisson process $\pi^z$ with intensity measure
$z\lambda \otimes \lambda^m$ on $\mathbb{S}$, with $z > 0$ (see for example [34] for definition and properties). For $\Lambda \Subset \mathbb{R}^d$, let us denote
by $\pi^z_\Lambda$ the marginal probability measure in $\Lambda$ of the Poisson process with intensity $z$. Without loss of
generality, the intensity $z$ is fixed to 1 and we simply write $\pi$ and $\pi_\Lambda$ in place of $\pi^1$ and $\pi^1_\Lambda$.

Let $\theta \in \mathbb{R}^p$ (for some $p \geq 1$). For any $\Lambda \Subset \mathbb{R}^d$, let us consider the parametric function $V_\Lambda(\cdot;\theta)$ from
$\Omega$ into $\mathbb{R} \cup \{+\infty\}$. From a physical point of view, $V_\Lambda(\varphi;\theta)$ is the energy of $\varphi_\Lambda$ in $\Lambda$ given the outside
configuration $\varphi_{\Lambda^c}$. In this paper, we focus on stationary point processes on $\mathbb{R}^d$, i.e. with stationary (i.e.
$T$-invariant) probability measure. For any $\Lambda \Subset \mathbb{R}^d$, we therefore consider $V_\Lambda(\cdot;\theta)$ to be $T$-invariant, i.e.
$V_\Lambda(\tau_x \varphi;\theta) = V_\Lambda(\varphi;\theta)$ for any $x \in \mathbb{R}^d$. Furthermore, $(V_\Lambda(\cdot;\theta))_{\Lambda \Subset \mathbb{R}^d}$ is a compatible family of energies, i.e.
for every $\Lambda \subset \Lambda' \Subset \mathbb{R}^d$, there exists a measurable function $\psi_{\Lambda,\Lambda'}$ from $\Omega$ into $\mathbb{R} \cup \{+\infty\}$ such that

$$\forall \varphi \in \Omega, \quad V_{\Lambda'}(\varphi;\theta) = V_\Lambda(\varphi;\theta) + \psi_{\Lambda,\Lambda'}(\varphi_{\Lambda^c};\theta). \qquad (1)$$

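A stationary marked Gibbs point process is characterized by a stationary marked Gibbs measure
usually defined as follows (see [41]).

Definition 1 A probability measure $P_\theta$ on $\Omega$ is a stationary marked Gibbs measure for the compatible
family of $T$-invariant energies $(V_\Lambda(\cdot;\theta))_{\Lambda \Subset \mathbb{R}^d}$ if for every $\Lambda \Subset \mathbb{R}^d$, for $P_\theta$-almost every outside
configuration $\varphi_{\Lambda^c}$, the law of $P_\theta$ given $\varphi_{\Lambda^c}$ admits the following conditional density with respect to $\pi_\Lambda$:

$$f_\Lambda(\varphi_\Lambda | \varphi_{\Lambda^c};\theta) = \frac{1}{Z_\Lambda(\varphi_{\Lambda^c};\theta)}\, e^{-V_\Lambda(\varphi;\theta)},$$

where $Z_\Lambda(\varphi_{\Lambda^c};\theta)$ is a normalization called the partition function.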

The existence of a Gibbs measure on $\Omega$ which satisfies these conditional specifications is a difficult
issue. We refer the interested reader to [J2], [H], [S], [T3], [TS] for the technical and mathematical
development of the existence problem.

In a first step, we assume that the family of energies is hereditary (the non-hereditary case will be
considered in Section 6), which means that for any $\Lambda \Subset \mathbb{R}^d$, for any $\varphi \in \Omega$, and for all $x^m \in \Lambda \times \mathbb{M}$,

$$V_\Lambda(\varphi;\theta) = +\infty \;\Rightarrow\; V_\Lambda(\varphi \cup x^m;\theta) = +\infty, \qquad (2)$$

or equivalently, for all $x^m \in \varphi_\Lambda$, $f_\Lambda(\varphi_\Lambda | \varphi_{\Lambda^c};\theta) > 0 \Rightarrow f_\Lambda(\varphi_\Lambda \setminus x^m | \varphi_{\Lambda^c};\theta) > 0$.
The minimal assumption of our paper is then:

[Mod]: For any $\theta \in \Theta$, where $\Theta$ is a compact subset of $\mathbb{R}^p$, there exists a stationary marked Gibbs
measure $P_\theta$ for the compatible $T$-invariant hereditary family $(V_\Lambda(\cdot;\theta))_{\Lambda \Subset \mathbb{R}^d}$. Our data consist in the
realization of a marked point process $\Phi$ with stationary marked Gibbs measure $P_{\theta^\star}$, where $\theta^\star \in \Theta$
is the unknown parameter vector to estimate.

Let us note that [Mod] ensures the existence of at least one stationary Gibbs measure. When this Gibbs
measure is not unique, we say that a phase transition occurs. In this situation the set of Gibbs measures
is a Choquet simplex and any Gibbs measure is a mixture of extremal ergodic Gibbs measures. If the
Gibbs measure is unique, it is necessarily ergodic (see [55] for more details about these properties).

In the rest of this paper, the reader has mainly to keep in mind the concept of local energy defined as
the energy required to insert a point $x^m$ into the configuration $\varphi$ and expressed for any $\Lambda \ni x$ by

$$V(x^m | \varphi;\theta) := V_\Lambda(\varphi \cup x^m;\theta) - V_\Lambda(\varphi;\theta). \qquad (3)$$

From the compatibility of the family of energies, i.e. (1), this definition does not depend on $\Lambda$.
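As an illustration of the local energy, here is a minimal Python sketch for a finite unmarked configuration. It assumes a Strauss-type pairwise energy (the model of Section 3.2.3) purely as an example; the function names `strauss_energy` and `local_energy` and the numerical values are hypothetical and not part of the paper.

```python
import math

def strauss_energy(points, theta1, theta2, R):
    """Energy of a finite configuration: theta1 * (number of points)
    + theta2 * (number of pairs at distance <= R). Assumed example model."""
    n = len(points)
    close_pairs = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= R
    )
    return theta1 * n + theta2 * close_pairs

def local_energy(x, points, theta1, theta2, R):
    """V(x | phi) = V(phi U {x}) - V(phi): energy needed to insert x into phi."""
    return (strauss_energy(points + [x], theta1, theta2, R)
            - strauss_energy(points, theta1, theta2, R))

# Hypothetical configuration: x has two neighbours within R = 0.1.
phi = [(0.0, 0.0), (0.05, 0.0), (0.5, 0.5)]
x = (0.02, 0.03)
v = local_energy(x, phi, theta1=1.0, theta2=0.5, R=0.1)
print(v)  # theta1 + 2 * theta2
```

For this model the difference reduces to theta1 plus theta2 times the number of neighbours of x within distance R, which is the closed form used in Section 3.2.3.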



3 The Takacs-Fiksel estimation procedure 

3.1 Presentation 

The basic ingredient for the definition of the Takacs-Fiksel method is the so-called GNZ formula. This 
equation was proved by Georgii, Nguyen and Zessin in the seventies but other authors such as Papangelou
and Takahashi also contributed to its establishment. See [ID] and |3SJ for historical comments and |H] 
or [35] for a general presentation. 

Lemma 1 (Georgii-Nguyen-Zessin Formula) Under [Mod], for any measurable function $h(\cdot,\cdot;\theta) : \mathbb{S} \times \Omega \to \mathbb{R}$
such that the following quantities are defined and finite, we have

$$E\left(\int_{\mathbb{R}^d \times \mathbb{M}} h(x^m, \Phi;\theta)\, e^{-V(x^m|\Phi;\theta^\star)}\, \mu(dx^m)\right) = E\left(\sum_{x^m \in \Phi} h(x^m, \Phi \setminus x^m;\theta)\right), \qquad (4)$$

where $E$ denotes the expectation with respect to $P_{\theta^\star}$.

For stationary marked Gibbs point processes, (4) reduces to

$$E\left(h(0^M, \Phi;\theta)\, e^{-V(0^M|\Phi;\theta^\star)}\right) = E\left(h(0^M, \Phi \setminus 0^M;\theta)\right), \qquad (5)$$

where $M$ denotes a random variable with probability distribution $\lambda^m$.

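A second tool used throughout the paper is the ergodic theorem established in [39]. Let us give a
simpler form here, which is sufficient for this paper.

Lemma 2 (Ergodic result) Under [Mod], we assume that $P_{\theta^\star}$ is ergodic. Then for any family of
measurable functions $F_\Lambda$, indexed by the bounded sets $\Lambda$, from $\Omega$ to $\mathbb{R}$ which are additive (i.e. $F_{\Lambda \cup \Lambda'} =
F_\Lambda + F_{\Lambda'} - F_{\Lambda \cap \Lambda'}$), shift invariant (i.e. $F_\Lambda(\varphi) = F_{\tau(\Lambda)}(\tau(\varphi))$ for any translation $\tau$) and integrable (i.e.
$E(|F_{[0,1]^d}(\Phi)|) < +\infty$), we have that for $P_{\theta^\star}$-almost every $\varphi$

$$\lim_{n \to +\infty} |\Lambda_n|^{-1} F_{\Lambda_n}(\varphi) = E\left(F_{[0,1]^d}(\Phi)\right),$$

where $\Lambda_n = [-n, n]^d$ (other regular domains $(\Lambda_n)_{n \geq 1}$ converging towards $\mathbb{R}^d$ could also be considered).

Let $h(\cdot,\cdot;\theta) : \mathbb{S} \times \Omega \to \mathbb{R}$ and let us define for any $\varphi \in \Omega$, $\theta \in \Theta$ and $\Lambda \Subset \mathbb{R}^d$

$$C_\Lambda(\varphi; h, \theta) := \int_{\Lambda \times \mathbb{M}} h(x^m, \varphi;\theta)\, e^{-V(x^m|\varphi;\theta)}\, \mu(dx^m) - \sum_{x^m \in \varphi_\Lambda} h(x^m, \varphi \setminus x^m;\theta). \qquad (6)$$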

Assume that we observe the realization of a marked point process $\Phi$ satisfying [Mod] in a domain
$\Lambda_n$. For appropriate choices of the functional $h$ and of the sequence of domains $\Lambda_n$, it is possible to apply
the ergodic result in Lemma 2 to prove that the first and second terms of $|\Lambda_n|^{-1} C_{\Lambda_n}(\Phi; h, \theta)$ respectively
converge $P_{\theta^\star}$-almost surely to the left and right terms of (5).

The latter observation is the basic argument to define the Takacs-Fiksel method. Let us give $K$
functions $h_k(\cdot,\cdot;\theta) : \mathbb{S} \times \Omega \to \mathbb{R}$ (for $k = 1, \ldots, K$), then the Takacs-Fiksel estimator is simply defined by

$$\hat\theta_n(\varphi) := \hat\theta_n^{TF}(\varphi) = \operatorname*{arg\,min}_{\theta \in \Theta} \sum_{k=1}^{K} C_{\Lambda_n}(\varphi; h_k, \theta)^2, \qquad (7)$$

where $\operatorname{arg\,min}_{\theta \in \Theta} F(\theta)$ means the parameter $\theta$ which minimizes the function $F$. Under the identification
assumption (16) given later, this minimum is obtained for a unique $\theta$ provided that $n$ is large enough.
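The definition (7) can be sketched numerically as follows. This is only a minimal illustration under several assumptions that are not in the paper: an unmarked Strauss-type local energy (see Section 3.2.3), the unit square as observation window, a midpoint rule for the integral term of (6), no edge correction, and a crude search over a finite list of candidate parameters. All function names and data are hypothetical.

```python
import math

def n_close(x, points, r):
    """Number of points of `points` within distance r of location x."""
    return sum(1 for y in points if math.dist(x, y) <= r)

def local_energy(x, points, theta, R):
    """Assumed Strauss local energy: theta1 + theta2 * #{y : ||y - x|| <= R}."""
    return theta[0] + theta[1] * n_close(x, points, R)

def gnz_residual(points, h, theta, R, grid=40):
    """C_Lambda(phi; h, theta) of (6) on the unit square:
    midpoint-rule integral term minus exact sum term."""
    cell = 1.0 / grid
    integral = 0.0
    for i in range(grid):
        for j in range(grid):
            u = ((i + 0.5) * cell, (j + 0.5) * cell)
            integral += h(u, points, theta) * math.exp(-local_energy(u, points, theta, R))
    integral *= cell * cell
    total = 0.0
    for k, x in enumerate(points):
        others = points[:k] + points[k + 1:]  # phi \ x
        total += h(x, others, theta)
    return integral - total

def tf_contrast(points, test_functions, theta, R, grid=40):
    """Sum over the test functions of the squared residuals, as in (7)."""
    return sum(gnz_residual(points, h, theta, R, grid) ** 2 for h in test_functions)

def tf_estimate(points, test_functions, candidates, R, grid=40):
    """Candidate parameter minimizing the contrast (crude search, not an optimizer)."""
    return min(candidates, key=lambda th: tf_contrast(points, test_functions, th, R, grid))

# Hypothetical data and test functions h_r(x, phi) = number of points within r of x.
phi = [(0.2, 0.2), (0.25, 0.2), (0.7, 0.6), (0.72, 0.65), (0.4, 0.9)]
tests = [lambda u, pts, th, r=r: n_close(u, pts, r) for r in (0.05, 0.1, 0.15)]
grid_theta = [(a / 2.0, b / 2.0) for a in range(-8, 1) for b in range(0, 5)]
theta_hat = tf_estimate(phi, tests, grid_theta, R=0.1, grid=20)
print(theta_hat)
```

In practice the candidate search would be replaced by a numerical optimizer and the window and edge treatment adapted to the data; the sketch only shows how the integral and sum terms of (6) enter the contrast.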



3.2 Some examples 

In this section, some examples of models and test functions $h$, involved in (7), are provided. The choices
made in previous studies are presented in Section 3.2.1. The two examples presented in Sections 3.2.2 and 3.2.3 show
the relevance of the Takacs-Fiksel procedure to provide quick estimates. The last example, in Section 3.2.4, is
concerned with the possible identification of a special marked point process, the quermass model, and
shows that appropriate choices of test functions can solve identification problems when points are not
observed.

There is no asymptotic consideration in this section. Therefore, the different estimates are defined
over a window, say $\Lambda$, and are denoted by $\hat\theta$.

3.2.1 Classical examples 

Let us first quote the particular case when the Takacs-Fiksel estimator reduces to the maximum
pseudo-likelihood estimator introduced by [29] for spatial point processes. The MPLE is obtained by maximizing
the log-pseudo-likelihood contrast function, given by

$$LPL_\Lambda(\varphi;\theta) = -\int_{\Lambda \times \mathbb{M}} e^{-V(x^m|\varphi;\theta)}\, \mu(dx^m) - \sum_{x^m \in \varphi_\Lambda} V(x^m | \varphi \setminus x^m;\theta). \qquad (8)$$

Therefore, with the choice $h_k(x^m, \varphi;\theta) = \frac{\partial}{\partial \theta_k} V(x^m | \varphi;\theta)$, $k = 1, \ldots, p$, the estimator $\hat\theta$, defined by (7),
solves the system $C_\Lambda(\varphi; h_k, \hat\theta) = 0$, $k = 1, \ldots, p$, which means that $\hat\theta$ is the root of the gradient vector of
$LPL_\Lambda$, i.e. $\hat\theta$ is the MPLE.

The first empirical study of the Takacs-Fiksel estimator can be found in [18], where this estimate is
compared to other estimators for some unmarked pairwise interaction point processes with two
parameters. In this context different test functions $h_{r_1}, \ldots, h_{r_K}$ were used, where for $r > 0$

$$h_r(x, \varphi;\theta) = |\varphi_{B(x,r)}| = \sum_{y \in \varphi} 1_{[0,r]}(\|y - x\|).$$

The integral term involved in (6) is approximated by discretization and the induced estimation (7) is then
assessed. Note that with this choice of test functions, the sum term in (6), when normalized by $|\Lambda|^{-1}$, is
an estimation of $\rho^2 \mathcal{K}(r)$, where $\mathcal{K}(\cdot)$ is the reduced second order function and $\rho$ denotes the intensity of
the stationary point process $\Phi$, i.e. for all $B \in \mathcal{B}(\mathbb{R}^d)$, $E(\Phi(B)) = \rho |B|$.

The latter choice requires the computation of the integral in (6) for all $\theta$. A more convenient choice
could be the one first proposed by Fiksel:

$$h_r(x, \varphi;\theta) = |\varphi_{B(x,r)}|\, e^{V(x|\varphi;\theta)} = e^{V(x|\varphi;\theta)} \sum_{y \in \varphi} 1_{[0,r]}(\|y - x\|). \qquad (9)$$

In the stationary case, this leads to the following approximation thanks to the ergodic theorem

$$\frac{1}{|\Lambda|} \int_\Lambda h_r(x, \varphi;\theta)\, e^{-V(x|\varphi;\theta)}\, dx \approx E\left|\Phi_{B(0,r)}\right| = \rho \pi r^2. \qquad (10)$$

The integral term in (6) is thus easily approximated by $|\varphi_\Lambda| \pi r^2$ for all $\theta$, while the sum can be explicitly
computed. These historical choices of test functions have a natural extension in the marked case.
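The approximation (10) can be turned into a short computation. The following Python sketch evaluates the resulting approximate residual for Fiksel's test function in dimension 2, assuming (as an illustration only) a Strauss-type local energy with interaction range `R` and ignoring edge effects; `fiksel_residual` and `count_within` are hypothetical names.

```python
import math

def count_within(x, points, r):
    """Number of points of `points` within distance r of x."""
    return sum(1 for y in points if math.dist(x, y) <= r)

def fiksel_residual(points, r, theta, R):
    """Approximate C_Lambda(phi; h_r, theta) for h_r as in (9), d = 2.

    Integral term: |phi_Lambda| * pi * r^2, following (10) (free of theta).
    Sum term: exact, with an assumed Strauss local energy
    V(x | phi) = theta1 + theta2 * #{y in phi : ||y - x|| <= R}."""
    integral_term = len(points) * math.pi * r ** 2
    sum_term = 0.0
    for k, x in enumerate(points):
        others = points[:k] + points[k + 1:]  # phi \ x
        v = theta[0] + theta[1] * count_within(x, others, R)
        sum_term += count_within(x, others, r) * math.exp(v)
    return integral_term - sum_term

pts = [(0.0, 0.0), (0.05, 0.0)]
print(fiksel_residual(pts, r=0.1, theta=(0.0, 0.0), R=0.1))
```

Only the sum term depends on the parameter, so evaluating the residual over many values of theta requires no numerical integration.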

3.2.2 Some choices leading to quick estimations 

The main advantage of the Takacs-Fiksel procedure is to provide quick consistent estimators, that might
supply initial values for a more elaborate procedure. A simple way to achieve this goal is to generalize (9)
and consider test functions of the form

$$h(x^m, \varphi;\theta) = \tilde h(x^m, \varphi)\, e^{V(x^m|\varphi;\theta)},$$

where $\tilde h(x^m, \varphi)$ does not depend on $\theta$. So, the integral term in (6) has to be computed only once and not
for all $\theta$, while the sum term in (6) does not require any approximation. Hence, the optimisation problem
(7) may be solved very quickly.
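The computational gain can be made concrete with a small Python sketch: the integral of the parameter-free factor is approximated once and then reused for every value of the parameter. The unit-square window, the midpoint rule, the Strauss-type local energy and the names `precompute_integral` and `quick_residual` are assumptions made for this illustration.

```python
import math

def count_within(x, points, r):
    """Number of points of `points` within distance r of x."""
    return sum(1 for y in points if math.dist(x, y) <= r)

def precompute_integral(points, h_tilde, grid=50):
    """Midpoint approximation of the integral of h_tilde(x, phi) over the
    unit square. It does not involve theta, so it is computed only once."""
    cell = 1.0 / grid
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            total += h_tilde(((i + 0.5) * cell, (j + 0.5) * cell), points)
    return total * cell * cell

def quick_residual(points, h_tilde, integral, theta, R):
    """C_Lambda for h = h_tilde * exp(V): cached integral minus exact sum.
    The local energy is an assumed Strauss form with range R."""
    sum_term = 0.0
    for k, x in enumerate(points):
        others = points[:k] + points[k + 1:]  # phi \ x
        v = theta[0] + theta[1] * count_within(x, others, R)
        sum_term += h_tilde(x, others) * math.exp(v)
    return integral - sum_term

pts = [(0.5, 0.5)]
h_const = lambda x, p: 1.0
cached = precompute_integral(pts, h_const)           # computed once
values = [quick_residual(pts, h_const, cached, (t, 0.0), 0.1) for t in (-1.0, 0.0, 1.0)]
print(values)
```

Each further evaluation costs one pass over the observed points, which is what makes a fine search or an iterative optimizer over theta cheap.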

In some particular examples, explicit formulas may even be obtained for the integral term, as in (10).
In the same spirit, an explicit estimator for the Strauss interaction is provided below.



3.2.3 An example of explicit estimator for the Strauss process 

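The (non-marked) Strauss process with range of interaction $R > 0$ is given for any $\Lambda \Subset \mathbb{R}^d$ by

$$V_\Lambda(\varphi;\theta) = \theta_1 |\varphi_\Lambda| + \theta_2 \sum_{\{x,y\} \cap \Lambda \neq \emptyset} 1_{[0,R]}(\|x - y\|), \qquad (11)$$

where $\theta_1 \in \mathbb{R}$ and $\theta_2 > 0$ are the two parameters of the model. Alternatively,

$$V(x | \varphi;\theta) = \theta_1 + \theta_2 \sum_{y \in \varphi} 1_{[0,R]}(\|y - x\|).$$

Let us consider the following family of test functions, for $k \in \mathbb{N} \setminus \{0\}$,

$$h_k(x, \varphi;\theta) = \begin{cases} e^{\theta_2 k} & \text{if } |\varphi_{B(x,R)}| = k, \\ 0 & \text{otherwise.} \end{cases} \qquad (12)$$

This choice gives in (6)

$$C_\Lambda(\varphi; h_k, \theta) = e^{-\theta_1} V_{k,\Lambda}(\varphi) - e^{k\theta_2} N_{k,\Lambda}(\varphi),$$

where $N_{k,\Lambda}(\varphi)$ denotes the number of points $x \in \varphi_\Lambda$ such that $|B(x, R) \cap (\varphi \setminus x)| = k$ and $V_{k,\Lambda}(\varphi)$ denotes
the volume of the set $\{y \in \Lambda, |B(y, R) \cap \varphi| = k\}$.

Several explicit estimators may be obtained following (7) from (at least) two test functions as above.
Let us quote the simplest one, corresponding to the choice $h_1$ and $h_2$ in (7). This leads to the contrast
function $C_\Lambda(\varphi; h_1, \theta)^2 + C_\Lambda(\varphi; h_2, \theta)^2$ which vanishes at the unique point $(\hat\theta_1(\varphi), \hat\theta_2(\varphi))$ obtained by solving $C_\Lambda(\varphi; h_1, \theta) = C_\Lambda(\varphi; h_2, \theta) = 0$, that is, with

$$\hat\theta_2(\varphi) = \log\left(\frac{V_{2,\Lambda}(\varphi)/N_{2,\Lambda}(\varphi)}{V_{1,\Lambda}(\varphi)/N_{1,\Lambda}(\varphi)}\right), \qquad \hat\theta_1(\varphi) = \log\left(\frac{V_{1,\Lambda}(\varphi)}{N_{1,\Lambda}(\varphi)}\right) - \hat\theta_2(\varphi). \qquad (13)$$

This estimator of $(\theta_1, \theta_2)$ is completely explicit, provided the quantities $N_{k,\Lambda}(\varphi)$ and $V_{k,\Lambda}(\varphi)$ are available.
They can be easily approximated by computational geometry tools.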
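A rough way to obtain these quantities is sketched below in Python. It assumes the unit square as window, approximates the volumes on a regular grid instead of using exact computational geometry, ignores points outside the window, and solves the two equations obtained by setting the residuals for k = 1 and k = 2 to zero, namely exp(-theta1) V_k = exp(k theta2) N_k. The function names are hypothetical.

```python
import math

def count_within(x, points, R):
    """Number of points of `points` within distance R of x."""
    return sum(1 for y in points if math.dist(x, y) <= R)

def n_k(points, k, R):
    """N_k: number of points having exactly k other points within distance R."""
    return sum(
        1
        for i, x in enumerate(points)
        if count_within(x, points[:i] + points[i + 1:], R) == k
    )

def v_k(points, k, R, grid=200):
    """V_k: grid approximation of the area of the set of locations y in the
    unit square having exactly k points of the pattern within distance R."""
    cell = 1.0 / grid
    hits = sum(
        1
        for i in range(grid)
        for j in range(grid)
        if count_within(((i + 0.5) * cell, (j + 0.5) * cell), points, R) == k
    )
    return hits * cell * cell

def explicit_strauss(n1, n2, v1, v2):
    """Solve exp(-theta1) * V_k = exp(k * theta2) * N_k for k = 1, 2.
    Requires all four inputs to be positive."""
    theta2 = math.log((v2 / n2) / (v1 / n1))
    theta1 = math.log(v1 / n1) - theta2
    return theta1, theta2

# Consistency check on synthetic counts built from known parameters.
t1, t2, v1, v2 = 0.3, 0.7, 2.0, 1.0
n1 = math.exp(-t1 - t2) * v1
n2 = math.exp(-t1 - 2 * t2) * v2
print(explicit_strauss(n1, n2, v1, v2))
```

On real data the estimate is undefined when one of the four quantities is zero, and nothing in this sketch enforces the constraint on the sign of the second parameter.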

3.2.4 A solution for unobservability issues 

The quermass model introduced in [30] is a marked point process which aims at modelling random sets in
$\mathbb{R}^2$. This is a generalization of the well-known Boolean model to interacting random balls. Let us denote
by $x^R$ a marked point where $x$ and $R > 0$ (i.e. the mark) respectively represent the center and the radius
of the associated ball $B(x, R)$. For a finite configuration $\varphi$, i.e. with a finite support instead of $\mathbb{R}^2$, the
quermass energy is defined for $(\theta_1, \theta_2, \theta_3, \theta_4) \in \mathbb{R}^4$ by:

$$V(\varphi;\theta) = \theta_1 |\varphi| + \theta_2 \mathcal{P}(\Gamma) + \theta_3 \mathcal{A}(\Gamma) + \theta_4 \mathcal{E}(\Gamma) \quad \text{where} \quad \Gamma = \bigcup_{(x,R) \in \varphi} B(x, R)$$

and $\mathcal{P}(\Gamma)$, $\mathcal{A}(\Gamma)$ and $\mathcal{E}(\Gamma)$ denote respectively the perimeter, the area and the Euler-Poincaré characteristic
(i.e. number of components minus number of holes) of the set $\Gamma$. To extend this definition to the infinite
support $\mathbb{R}^2$, it is convenient to suppose that the radii of the balls are almost surely uniformly bounded
(i.e. $\lambda^m([0, R_0]) = 1$ for some $R_0 > 0$). In this case the family of energies $(V_\Lambda)$ is defined by

$$V_\Lambda(\varphi;\theta) = V\left(\varphi_{\Lambda \oplus B(0,2R_0)};\theta\right) - V\left(\varphi_{\Lambda \oplus B(0,2R_0) \setminus \Lambda};\theta\right).$$

This definition may be extended to unbounded radii, though a restriction to the so-called tempered
configurations is needed to ensure the existence of the associated Gibbs measure. We refer to [M] for
more details.

When $\theta_2 = \theta_3 = \theta_4 = 0$, this model reduces to the Boolean model (see [44] for a survey). The area
process (see [3]) is also a particular case, taking $\theta_2 = \theta_4 = 0$.



In practice, one only observes the random set $\Gamma$, so the marked points $x^R$ in $\varphi$ are unknown. A
challenging task is then to estimate the parameters $(\theta_1, \theta_2, \theta_3, \theta_4)$ in the presence of this unobservability
issue. In particular, a direct application of the maximum likelihood or pseudo-likelihood method is
impossible to estimate all the parameters, and especially $\theta_1$ which requires the observation of the number
of points in $\varphi$. For the other parameters, which are related to the observable functionals $\mathcal{P}$, $\mathcal{A}$ and $\mathcal{E}$, the
MLE has been investigated in [55].

Let us show that the Takacs-Fiksel procedure may be used to estimate $\theta_1$ in spite of this unobservability
issue. Indeed, it is possible to choose some test function $h$ such that both the integral and the sum in (6)
are computable. The unobservability issue occurs mainly for the sum term. Let us consider the following
example of test function:

$$h_{per}(x^R, \varphi;\theta) = \mathcal{P}\left(C(x, R) \cap \Gamma^c\right), \qquad (14)$$

where $C(x, R)$ is the sphere $\{y, \|y - x\| = R\}$. For any finite configuration $\varphi$, we then have

$$\sum_{x^R \in \varphi} h_{per}(x^R, \varphi \setminus x^R;\theta) = \mathcal{P}(\Gamma),$$

so that this sum is computable even if each term $h_{per}(x^R, \varphi \setminus x^R;\theta)$ is not. If the configuration $\varphi$ is
infinite then for any bounded set $\Lambda$, $\sum_{x^R \in \varphi_\Lambda} h_{per}(x^R, \varphi \setminus x^R;\theta)$ is equal to the perimeter of $\Gamma$ restricted
to $\Lambda$ plus a boundary term which is asymptotically negligible with respect to the volume of $\Lambda$.

Consequently, assuming $(\theta_2, \theta_3, \theta_4)$ known, $\theta_1$ may be estimated thanks to (7) with the above choice
of test function.

The joint estimation of all four parameters might be achieved thanks to additional test functions
sharing the same property as above, i.e. such that the sum in (6) is observable.

In the particular case of the area process, where $\theta_2 = \theta_4 = 0$ and $R$ is constant (i.e. $\lambda^m = \delta_R$), it suffices
to find one more test function to ensure an identifiable estimation (see Example 2 in Section 5 for more
details). A possible additional test function is

$$h_{iso}(x^R, \varphi;\theta) = \begin{cases} 1 & \text{if } \mathcal{P}\left(C(x, R) \cap \Gamma^c\right) = 2\pi R, \\ 0 & \text{otherwise.} \end{cases} \qquad (15)$$

In this case $\sum_{x^R \in \varphi} h_{iso}(x^R, \varphi \setminus x^R;\theta)$ corresponds to the number of isolated balls in $\Gamma$.
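With real data this count is read directly from the observed set. In a simulation study, where the centers are known, it can be checked with a few lines of Python. The sketch below assumes a constant radius, so that a ball is isolated exactly when every other center lies at distance greater than twice the radius (tangency, a null event, is ignored); `count_isolated_balls` is a hypothetical name.

```python
import math

def count_isolated_balls(centers, R):
    """Number of balls B(x, R) meeting no other ball of the configuration,
    i.e. whose center is at distance > 2R from every other center."""
    return sum(
        1
        for i, x in enumerate(centers)
        if all(math.dist(x, y) > 2 * R for j, y in enumerate(centers) if j != i)
    )

# Two overlapping balls and one isolated ball.
centers = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
print(count_isolated_balls(centers, R=0.1))
```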

4 Asymptotic results for the Takacs-Fiksel estimator 

We present in this section asymptotic results for the Takacs-Fiksel estimator for a point process satisfying
[Mod] and assumed to be observed in a domain $\Lambda_n$, where $(\Lambda_n)_{n \geq 1}$ is a sequence of increasing cubes
whose size goes to $+\infty$ as $n$ goes to $+\infty$.

First, for a function $g$ depending on $\theta$, we denote by $g^{(1)}(\theta)$ (resp. $g^{(2)}(\theta)$) the gradient vector of
length $p$ (resp. the Hessian matrix of size $(p, p)$) evaluated at $\theta$. Let us rewrite the Takacs-Fiksel estimator
as

$$\hat\theta_n(\varphi) = \operatorname*{arg\,min}_{\theta \in \Theta} U_{\Lambda_n}(\varphi; h, \theta),$$

with $U_{\Lambda_n}(\varphi; h, \theta) = |\Lambda_n|^{-2} \sum_{k=1}^{K} C_{\Lambda_n}(\varphi; h_k, \theta)^2$, where $h = (h_1, \ldots, h_K)$ and $C_{\Lambda_n}$ is given by (6).

4.1 Consistency 

The consistency is obtained under the following assumptions, denoted by [C]: for any Gibbs measure
$P_{\theta^\star}$, for all $\theta \in \Theta$, $k = 1, \ldots, K$ and $\varphi \in \Omega$

[C1] For all $x \in \mathbb{R}^d$, $h_k(x^m, \varphi;\theta) = h_k(0^m, \tau_x \varphi;\theta)$ and

$$E\left(|h_k(0^M, \Phi;\theta)|\, e^{-V(0^M|\Phi;\theta')}\right) < +\infty, \quad \text{for } \theta' = \theta, \theta^\star.$$

[C2] $U_{\Lambda_n}(\varphi; h, \cdot)$ is a continuous function for $P_{\theta^\star}$-a.e. $\varphi$.

[C3]

$$\sum_{k=1}^{K} E\left(h_k(0^M, \Phi;\theta)\left(e^{-V(0^M|\Phi;\theta)} - e^{-V(0^M|\Phi;\theta^\star)}\right)\right)^2 = 0 \;\Longrightarrow\; \theta = \theta^\star. \qquad (16)$$

[C4] $h_k$ and $f_k$, defined by $f_k(x^m, \varphi;\theta) := h_k(x^m, \varphi;\theta)\, e^{-V(x^m|\varphi;\theta)}$, are continuously differentiable and

$$E\left(\max_{\theta \in \Theta} |f_k(0^M, \Phi;\theta)|\right) < +\infty \quad \text{and} \quad E\left(\max_{\theta \in \Theta} |h_k(0^M, \Phi;\theta)|\, e^{-V(0^M|\Phi;\theta^\star)}\right) < +\infty,$$

$$E\left(\max_{\theta \in \Theta} \|f_k^{(1)}(0^M, \Phi;\theta)\|\right) < +\infty \quad \text{and} \quad E\left(\max_{\theta \in \Theta} \|h_k^{(1)}(0^M, \Phi;\theta)\|\, e^{-V(0^M|\Phi;\theta^\star)}\right) < +\infty.$$

Theorem 3 Assuming [Mod] and [C] then, as $n \to +\infty$, the Takacs-Fiksel estimator $\hat\theta_n(\varphi)$ converges
towards $\theta^\star$ for $P_{\theta^\star}$-a.e. $\varphi$.

Assumptions [C1], [C2] and [C4] are related to the regularity and the integrability of the different
test functions and the local energy function. Some general criteria may be proposed to verify these
assumptions, see Section 4.3 for a discussion. Assumption [C3] corresponds to an identifiability condition
and requires much more attention. It is well-known that such an assumption is fulfilled when $h = V^{(1)}$
(leading to the MPLE) under mild assumptions (see Assumption [Ident] proposed by [9]). Whether this
remains true for more general test functions is a difficult question (it is actually untrue in several
cases). This will be discussed specifically in Section 5.

4.2 Asymptotic normality 

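We need the following assumptions denoted by [N]: for any Gibbs measure $P_{\theta^\star}$, $k = 1, \ldots, K$, $\Lambda \Subset \mathbb{R}^d$,
$\varphi \in \Omega$ and $\theta$ in a neighborhood $\mathcal{V}(\theta^\star)$ of $\theta^\star$:

[N1] $E\left(|C_\Lambda(\Phi; h_k, \theta^\star)|^3\right) < +\infty$.

[N2] For any sequence of bounded domains $\Gamma_n$ such that $\Gamma_n \to 0$ as $n \to +\infty$, $E\left(C_{\Gamma_n}(\Phi; h_k, \theta^\star)^2\right) \to 0$.

[N3] $C_\Lambda(\varphi; h_k, \theta^\star)$ depends only on $\varphi_{\Lambda \oplus B(0,D)}$ for some $D > 0$ (which is uniform in $\Lambda$, $\varphi$, $\theta^\star$).

[N4] $h_k$ and $f_k$ (defined in [C4]) are twice continuously differentiable in $\theta$ and

$$E\left(\|h_k^{(2)}(0^M, \Phi;\theta)\|\, e^{-V(0^M|\Phi;\theta^\star)}\right) < +\infty \quad \text{and} \quad E\left(\|f_k^{(2)}(0^M, \Phi;\theta)\|\right) < +\infty.$$

Let us remark that Assumption [N3] leads us to consider in general that $V$ has a finite range, which
means that there exists $D > 0$ such that for all $(m, \varphi) \in \mathbb{M} \times \Omega$ and all $\theta \in \Theta$

$$V(0^m | \varphi;\theta) = V(0^m | \varphi_{B(0,D)};\theta).$$

The same kind of finite range property is also expected for $(h_k)$.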

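Theorem 4 Under Assumptions [Mod], [C] and [N], for any ergodic Gibbs measure $P_{\theta^\star}$ the following
convergence in distribution holds as $n \to +\infty$

$$|\Lambda_n|^{1/2}\, \mathcal{E}(h, \theta^\star)\, \mathcal{E}(h, \theta^\star)^T \left(\hat\theta_n(\Phi) - \theta^\star\right) \xrightarrow{d} \mathcal{N}\left(0,\ \mathcal{E}(h, \theta^\star)\, \Sigma(h, \theta^\star)\, \mathcal{E}(h, \theta^\star)^T\right), \qquad (17)$$

where $\mathcal{E}(h, \theta^\star)$ is the $(p, K)$ matrix defined for $i = 1, \ldots, p$ and $k = 1, \ldots, K$ by

$$\left(\mathcal{E}(h, \theta^\star)\right)_{ik} = E\left(h_k(0^M, \Phi;\theta^\star)\, \left(V^{(1)}(0^M|\Phi;\theta^\star)\right)_i\, e^{-V(0^M|\Phi;\theta^\star)}\right)$$

and where $\Sigma(h, \theta^\star)$ is the $(K, K)$ matrix defined by

$$\Sigma(h, \theta^\star) = D^{-d} \sum_{|\ell| \leq 1} E\left(C_{\Delta_0(D)}(\Phi; h, \theta^\star)\, C_{\Delta_\ell(D)}(\Phi; h, \theta^\star)^T\right), \qquad (18)$$

where, for all $k \in \mathbb{Z}^d$, $\Delta_k(D)$ is the cube centered at $kD$ with side-length $D$ and where, for any bounded
domain $\Lambda$, $C_\Lambda(\Phi; h, \theta^\star) := \left(C_\Lambda(\Phi; h_k, \theta^\star)\right)_{k=1,\ldots,K}$.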



Remark 1 While Assumptions [N1-3] will ensure that a central limit theorem holds for $U^{(1)}_{\Lambda_n}(\Phi; h, \theta^\star)$,
Assumption [N4] is required to prove that $U^{(2)}_{\Lambda_n}(\Phi; h, \theta)$ is uniformly bounded in a neighborhood of $\theta^\star$.
These two statements allow us to apply a general central limit theorem for minimum contrast estimators
(e.g. Theorem 3.4.5 of JEE/).

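Remark 2 In Theorem 4, if the Gibbs measure $P_{\theta^\star}$ is stationary (and then not necessarily ergodic), it
is a mixture of ergodic measures and the left hand term in (17) converges in distribution to a mixture
of normal distributions. We may also propose an asymptotic result valid in the presence of a phase
transition. Indeed, if the matrix $\mathcal{E}(\mathbf{h},\theta^\star)\Sigma(\mathbf{h},\theta^\star)\mathcal{E}(\mathbf{h},\theta^\star)^T$ is positive definite (for any extremal ergodic
measure of the Choquet simplex of stationary Gibbs measures), then we may define a consistent empirical
version $\widehat{S}(\Phi)$ of

$$\left[\mathcal{E}(\mathbf{h},\theta^\star)\,\Sigma(\mathbf{h},\theta^\star)\,\mathcal{E}(\mathbf{h},\theta^\star)^T\right]^{-1/2} \mathcal{E}(\mathbf{h},\theta^\star)\,\mathcal{E}(\mathbf{h},\theta^\star)^T$$

(see Section 4 of |Jj| for more details) to obtain, from (17), a standard Gaussian limit:

$$|\Lambda_n|^{1/2}\, \widehat{S}(\Phi) \left(\widehat{\theta}_n(\Phi) - \theta^\star\right) \xrightarrow{d} \mathcal{N}(0, I_p).$$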

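Remark 3 Following Section 3.2.1, let us underline that (17) is coherent with the asymptotic normality
of the MPLE established in [11], i.e. with the case $K = p$ and $\mathbf{h} = \mathbf{V}^{(1)}$. Indeed, with similar assumptions
to the ones presented in the present paper, Equation (4.4) in Theorem 2 of [11] states that

$$|\Lambda_n|^{1/2}\, A(\theta^\star) \left(\widehat{\theta}_n(\Phi) - \theta^\star\right) \xrightarrow{d} \mathcal{N}\left(0,\ \Sigma(\mathbf{V}^{(1)},\theta^\star)\right), \tag{19}$$

where $A(\theta^\star)$ is the symmetric $(p,p)$ matrix given for $i, k = 1,\ldots,p$ by

$$\left(A(\theta^\star)\right)_{ik} = E\left(\left(\mathbf{V}^{(1)}(0^M|\Phi;\theta^\star)\right)_i \left(\mathbf{V}^{(1)}(0^M|\Phi;\theta^\star)\right)_k\, e^{-V(0^M|\Phi;\theta^\star)}\right).$$

Since $\mathbf{h} = \mathbf{V}^{(1)}$, we have $\mathcal{E}(\mathbf{V}^{(1)},\theta^\star) = A(\theta^\star)$ and therefore (17) reduces to

$$|\Lambda_n|^{1/2}\, A(\theta^\star)^2 \left(\widehat{\theta}_n(\Phi) - \theta^\star\right) \xrightarrow{d} \mathcal{N}\left(0,\ A(\theta^\star)\,\Sigma(\mathbf{V}^{(1)},\theta^\star)\,A(\theta^\star)^T\right),$$

which is exactly (19) by assuming that $A(\theta^\star)$ is invertible.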

Remark 4 How to choose the test functions in order to minimize the norm of the asymptotic
covariance matrix is a difficult question. It is still open and is a perspective for future work.

4.3 Discussion 

The present paragraph is devoted to the discussion of Assumptions [Mod], [C] (except [C3]) and [N]. 
In the previous sections, we have expressed the different assumptions in a very general way. Our aim, 
here, is to make these assumptions concrete for a wide range of models and a wide range of test functions, 
in order to illustrate that our setting is not restrictive. In particular, we will focus on exponential family 
models having a local energy of the form: 

$$V(x^m|\varphi;\theta) := \theta^T \mathbf{V}(x^m|\varphi) = \theta_1 V_1(x^m|\varphi) + \ldots + \theta_p V_p(x^m|\varphi), \tag{20}$$

with $\mathbf{V} = (V_1, \ldots, V_p)$ a vector function of $(x^m, \varphi)$ with values in $\mathbb{R}^p$.

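Let us consider the following assumptions: for all $(m, \varphi) \in \mathbb{M} \times \Omega$,

[Exp] For $i = 1,\ldots,p$, for all $x \in \mathbb{R}^d$, $V_i(x^m|\varphi) = V_i(0^m|\tau_x\varphi)$ and there exist $\kappa_i^{(\inf)}, \kappa_i^{(\sup)} \geq 0$, $k_i \in \mathbb{N}$,
$D > 0$ such that one of the two following assumptions is satisfied:

$$\theta_i \geq 0 \quad \text{and} \quad -\kappa_i^{(\inf)} \leq V_i(0^m|\varphi) = V_i(0^m|\varphi_{B(0,D)}) \leq \kappa_i^{(\sup)}\, |\varphi_{B(0,D)}|^{k_i}$$

or

$$-\kappa_i^{(\inf)} \leq V_i(0^m|\varphi) = V_i(0^m|\varphi_{B(0,D)}) \leq \kappa_i^{(\sup)}.$$

[Exp'] Assumption [Exp] with $k_i = 0$ or $1$ for all $i$ (when $\theta_i \geq 0$).

[H] For all $x \in \mathbb{R}^d$, $h(x^m,\varphi;\theta) = h(0^m,\tau_x\varphi;\theta)$ and there exist $\kappa \geq 0$, $k \in \mathbb{N}$, $D \geq 0$ such that
$h(0^m,\varphi;\theta) = h(0^m,\varphi_{B(0,D)};\theta)$, such that $h(0^m,\varphi;\cdot)$ is twice continuously differentiable in $\theta$ and
such that $|Y(\varphi,m)| \leq \kappa\, |\varphi_{B(0,D)}|^k$, where

$$Y(\varphi,m) := \max\left(|h(0^m,\varphi;\theta)|,\ \|\mathbf{h}^{(1)}(0^m,\varphi;\theta)\|,\ \|\mathbf{h}^{(2)}(0^m,\varphi;\theta)\|\right). \tag{21}$$

[H'] $h(0^m,\varphi;\theta) = \widetilde{h}(0^m,\varphi;\theta)\, e^{\theta^T \mathbf{V}(0^m|\varphi)}$ with $\widetilde{h}$ satisfying [H].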

Let us underline that the different constants involved in these assumptions are assumed to be independent
of $m, \varphi, \theta$. Note also that if the test function $h$ is independent of $\theta$, $Y(\varphi,m)$ obviously reduces to $|h(0^m,\varphi)|$.

These assumptions are common and very simple to check. Assumption [Exp] has already been
investigated in [9]. It includes a wide variety of models such as the overlap area point process, the
multi-Strauss marked point process, the k-nearest-neighbor multi-Strauss marked point process, the Strauss
type disc process, Geyer's triplet point process, the area process, some special cases of the quermass
process (for instance when $\lambda^m$ has a compact support not containing 0 and $\theta_4 = 0$), etc. Among these
models, the only one that does not satisfy [Exp'] is Geyer's triplet point process (see [3], p. 242).

On the other hand, the test functions $h(x^m,\varphi;\theta) = 1$, $h(x^m,\varphi;\theta) = V_k^{(1)}(x^m|\varphi;\theta) = V_k(x^m|\varphi)$ and
$h(x^m,\varphi;\theta) = |\varphi_{B(x,r)}|$ satisfy [H] (for the second one, it is implied by [Exp]). Note that the
functional described by (12) for the Strauss model depending on $\theta$, the functionals $h_{per}$ and $h_{iso}$ in (14),
(|T5|) also satisfy [H]. In a similar way, test functions like e^v^M ; e ff r v(* m \<p)/2 i ^^^e^i^W) ^ 

l [0 ^(d{x m ,ip))e eTw{ ~ xm ^ satisfy [H]. 

We now show that most of the assumptions required in Theorems 3 and 4 are not too restrictive.

Proposition 5 (i) Under Assumption [Exp], Assumption [Mod] is fulfilled.

(ii) For a test function satisfying [H] (resp. [H']) and a model satisfying [Exp] (resp. [Exp']),
Assumptions [C] (except [C3]) and [N] are fulfilled.

Proof. (i) From [Exp], it is easy to check that for any fixed $\theta$, there exists a positive constant $K(\theta)$
such that $V(0^m|\varphi;\theta) \geq -K(\theta)$. Therefore the local energy is stable for any $\theta$. On the other hand it
is finite range. These two properties imply the existence of ergodic measures (see e.g. [5], Proposition
1). Among them, there exists at least one stationary measure due to the invariance by translation of the
family of energy functions.

(ii) [C2] and [N3] are quite obvious to check. Now, from the stability of the local energy, we have
that for all $(m,\varphi) \in \mathbb{M} \times \Omega$, $\theta^T \mathbf{V}(0^m|\varphi) \geq -\rho$ for $\rho < +\infty$, independent of $m, \varphi, \theta$ (this is possible
because $\Theta$ is compact). Let us also underline that this property ensures that for every $\Lambda \Subset \mathbb{R}^d$, every
$c \in \mathbb{R}$, $E(e^{c|\Phi_\Lambda|}) < +\infty$ (see e.g. Proposition 11 of [6]), which obviously implies that $E(|\Phi_\Lambda|^\alpha) < +\infty$ and
$E(|\Phi_\Lambda|^\alpha e^{c|\Phi_\Lambda|}) < +\infty$ for every $\alpha > 0$. Now, under [H] and [Exp] (or [H'] and [Exp']), the expectations
in [C1], [C4], [N1] and [N4] are clearly finite. Let us focus on [C1] for example (the justification for
the other assumptions is similar). We have for any $\theta, \theta' \in \Theta$

$$E\left(|h(0^M,\Phi;\theta)|\, e^{-\theta'^T \mathbf{V}(0^M|\Phi)}\right) \leq
\begin{cases}
e^{\rho}\, E\left(|h(0^M,\Phi;\theta)|\right) \leq c \times e^{\rho}\, E\left(|\Phi_{B(0,D)}|^{\alpha}\right) & \text{under [H] and [Exp]}, \\
e^{\rho}\, E\left(|\widetilde{h}(0^M,\Phi;\theta)|\, e^{\theta^T \mathbf{V}(0^M|\Phi)}\right) \leq c \times e^{\rho}\, E\left(|\Phi_{B(0,D)}|^{\alpha}\, e^{c|\Phi_{B(0,D)}|}\right) & \text{under [H'] and [Exp']},
\end{cases}$$

for some constants $\alpha$ and $c$. Note that, if we had not assumed [Exp'] for test functions of the form [H'],
then one would have had expectations of the form $E\left(e^{c|\Phi_\Lambda|^k}\right)$ for some $k > 1$, which is not necessarily
finite under the local stability property. Assumption [N2] is proved similarly, using the dominated
convergence theorem. ■



Remark 5 By following ideas in [11], it is possible to fulfill the integrability type assumptions for more
complicated models such as the Lennard-Jones model (which is not locally stable and is nonlinear in terms
of the parameters). The use of Ruelle's estimates J^ffi plays a crucial role in this case of superstable
interaction. For the sake of conciseness and simplicity, we do not investigate this in the present paper.

5 Identifiability : Assumption [C3] 

Assumption [C3] is related to the identifiability of the estimation procedure. It is more complicated
to verify than the other assumptions and an investigation to obtain a criterion or a characterization seems
necessary. We address this question in this section.

In the following, we consider that the interaction has an exponential form as in (20). Then [C3] is
equivalent to: $\theta = \theta^\star$ is the unique solution of the nonlinear system of equations in $\theta$ defined by

$$E\left(h_k(0^M,\Phi;\theta)\left(e^{-\theta^T \mathbf{V}(0^M|\Phi)} - e^{-\theta^{\star T} \mathbf{V}(0^M|\Phi)}\right)\right) = 0, \quad 1 \leq k \leq K. \tag{22}$$

If $h_k$ and $\mathbf{V}$ are sufficiently regular, each equation in (22) gives a $(p-1)$-dimensional manifold of solutions
in $\Theta$ containing $\theta^\star$. So it is clear that the choice $K \geq p$ is in general necessary to prove that the system
(22) admits the unique solution $\theta^\star$.

In Section 5.1 we investigate the delicate case $K = p$ in detail. In contrast to the linear case, where
$p$ hyperplanes in $\mathbb{R}^p$ have in general a unique common point, the intersection of $p$ $(p-1)$-dimensional
manifolds does not generally reduce to a single point. So, when $K = p$, there is no guarantee that (22)
has a unique solution $\theta^\star$. This is illustrated by a simple example at the beginning of Section 5.1. In
Proposition 6 we provide a criterion ensuring that the system in (22) admits $\theta^\star$ as its only solution.
Some examples to which the criterion applies are presented, and the rigidity of the criterion when
$p \geq 3$ is also discussed. In the case where $p = 2$, we show that our criterion is not far from being necessary.

The case $K > p$ is studied in Section 5.2. The identification problem should be simpler since, in
general, $p + 1$ $(p-1)$-dimensional manifolds in $\mathbb{R}^p$ have no common point. We give a sufficient criterion
to prove the identification but we think that it is far from being necessary.

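Before presenting these two sections, let us give further notation. We denote by $P_{\mathbf{V}}$ the law of
$\mathbf{V}(0^M|\Phi)$ in $\mathbb{R}^p$. We also define the function $\Psi_\theta$, for each $\theta \in \Theta$, by

$$\Psi_\theta : \mathbb{R}^p \to \mathbb{R}^K, \qquad \mathbf{v} \mapsto
\begin{pmatrix}
E\left(h_1(0^M,\Phi;\theta) \,\middle|\, \mathbf{V}(0^M|\Phi) = \mathbf{v}\right) \\
\vdots \\
E\left(h_K(0^M,\Phi;\theta) \,\middle|\, \mathbf{V}(0^M|\Phi) = \mathbf{v}\right)
\end{pmatrix}. \tag{23}$$

We will see that this function plays a crucial role in the identification problem.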

5.1 The case K = p 

First of all, let us give a simple example to show that the identification problem is delicate in the situation
where $K = p$. Let us consider that $K = p = 2$, $V_1 = 1$, and let us choose the simple test functions $h_1 = 1$
and $h_2 = e^{\theta^T \mathbf{V}(x^m|\varphi)}$. Then $\widetilde{\theta} = (\widetilde{\theta}_1, \widetilde{\theta}_2)$ with $\widetilde{\theta}_1 = \theta_1^\star - \ln\left(E\left(e^{-\theta_2^\star V_2(0^M|\Phi)}\right)\right)$ and $\widetilde{\theta}_2 = 0$ is always a solution
of the system in (22). Therefore if $\theta_2^\star \neq 0$ and if $\widetilde{\theta}$ defined before is in $\Theta$, then the system in (22) admits
at least two solutions.
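The example above can be checked numerically. The following is only a minimal sketch: the law of $V_2(0^M|\Phi)$ is replaced by a hypothetical uniform sample, expectations are replaced by empirical means over that sample, and the names `tf_equations`, `v2_samples`, `theta_star` and `theta_tilde` are illustrative and do not come from the paper. It evaluates the two left-hand sides of (22) for $h_1 = 1$ and $h_2 = e^{\theta^T \mathbf{V}}$ with $V_1 = 1$.

```python
import math
import random


def tf_equations(theta, theta_star, v2_samples):
    """Empirical left-hand sides of system (22) for h1 = 1 and h2 = exp(theta^T V), V = (1, V2)."""
    n = len(v2_samples)
    eq1 = 0.0
    eq2 = 0.0
    for v2 in v2_samples:
        e_theta = math.exp(-(theta[0] + theta[1] * v2))
        e_star = math.exp(-(theta_star[0] + theta_star[1] * v2))
        diff = e_theta - e_star
        eq1 += diff            # test function h1 = 1
        eq2 += diff / e_theta  # test function h2 = exp(theta^T V) = 1 / e_theta
    return eq1 / n, eq2 / n


random.seed(0)
# Hypothetical stand-in for the law of V2(0^M | Phi); any sample would do.
v2_samples = [random.uniform(0.0, 3.0) for _ in range(1000)]
theta_star = (1.0, 0.7)

# Second solution described in the text: theta_tilde_2 = 0 and
# theta_tilde_1 = theta_star_1 - log E(exp(-theta_star_2 * V2)).
mean_exp = sum(math.exp(-theta_star[1] * v2) for v2 in v2_samples) / len(v2_samples)
theta_tilde = (theta_star[0] - math.log(mean_exp), 0.0)

eq_star = tf_equations(theta_star, theta_star, v2_samples)
eq_tilde = tf_equations(theta_tilde, theta_star, v2_samples)
print(eq_star, eq_tilde)
```

Both `eq_star` and `eq_tilde` vanish up to rounding error although `theta_tilde` differs from `theta_star`, which is the lack of identifiability described above.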

In the following, we first give a sufficient criterion to prove the identifiability and propose some
examples. Next, we show the rigidity of our criterion, which seems constraining when $p \geq 3$.

5.1.1 Criterion for identifiability 

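Assumption [Det] gathers the two following assumptions:

[Det(≠)] For every $\theta$ in $\Theta$, $\det(\mathbf{v}_1,\ldots,\mathbf{v}_p)\,\det(\Psi_\theta(\mathbf{v}_1),\ldots,\Psi_\theta(\mathbf{v}_p))$ is not $(P_{\mathbf{V}})^{\otimes p}$-a.s. identically null.

[Det(≥)] For every $\theta$ in $\Theta$, there exists $\varepsilon = \pm 1$ such that for $(P_{\mathbf{V}})^{\otimes p}$-a.s. every $(\mathbf{v}_1,\ldots,\mathbf{v}_p)$ in $(\mathbb{R}^p)^p$

$$\varepsilon\, \det(\mathbf{v}_1,\ldots,\mathbf{v}_p)\,\det(\Psi_\theta(\mathbf{v}_1),\ldots,\Psi_\theta(\mathbf{v}_p)) \geq 0.$$

When $\varepsilon = 1$ (respectively $\varepsilon = -1$), [Det(≥)] means that $\Psi_\theta$ preserves the sign (respectively the opposite
sign) of the determinant.

The criterion is the following.

Proposition 6 If $K = p$ then Assumption [Det] ensures that Assumption [C3] holds.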



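Proof. Denoting by $\zeta$ the real function $x \mapsto \ln\left(\frac{e^x - 1}{x}\right)$ with the convention $\zeta(0) = 0$, the equations (22)
become

$$(\theta^\star - \theta)^T \mathbf{X}_k(\theta,\theta^\star) = 0, \quad 1 \leq k \leq p, \tag{24}$$

where the vector $\mathbf{X}_k(\theta,\theta^\star)$ is defined by

$$\mathbf{X}_k(\theta,\theta^\star) = E\left(h_k(0^M,\Phi;\theta)\, e^{-\theta^{\star T}\mathbf{V}(0^M|\Phi)}\, e^{\zeta\left((\theta^\star-\theta)^T\mathbf{V}(0^M|\Phi)\right)}\, \mathbf{V}(0^M|\Phi)\right). \tag{25}$$

Indeed, for any $\mathbf{v}$, $e^{-\theta^T\mathbf{v}} - e^{-\theta^{\star T}\mathbf{v}} = e^{-\theta^{\star T}\mathbf{v}}\, e^{\zeta((\theta^\star-\theta)^T\mathbf{v})}\, (\theta^\star-\theta)^T\mathbf{v}$.
Therefore the system (22) admits the unique solution $\theta^\star$ if the family of vectors $(\mathbf{X}_k(\theta,\theta^\star))_{1 \leq k \leq p}$ is
independent in $\mathbb{R}^p$. Let us give a formula for the determinant of these vectors which shows that it is not
null.

Conditioning by the law of $\mathbf{V}(0^M|\Phi)$ and using the multi-linearity of the determinant, we obtain

$$\begin{aligned}
\det(\mathbf{X}_1(\theta,\theta^\star),\ldots,\mathbf{X}_p(\theta,\theta^\star))
&= \int\!\cdots\!\int \det\Big( E\big[h_k(0^M,\Phi;\theta)\, e^{-\theta^{\star T}\mathbf{v}_k}\, e^{\zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \mathbf{v}_k \,\big|\, \mathbf{V}(0^M|\Phi) = \mathbf{v}_k\big],\ k = 1,\ldots,p \Big)\, P_{\mathbf{V}}(d\mathbf{v}_1)\cdots P_{\mathbf{V}}(d\mathbf{v}_p) \\
&= \int\!\cdots\!\int e^{\sum_{k=1}^{p} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \det(\mathbf{v}_1,\ldots,\mathbf{v}_p) \prod_{k=1}^{p} E\left[h_k(0^M,\Phi;\theta) \,\middle|\, \mathbf{v}_k\right] P_{\mathbf{V}}(d\mathbf{v}_1)\cdots P_{\mathbf{V}}(d\mathbf{v}_p) \\
&= \frac{1}{p!} \sum_{\sigma \in S_p} \int\!\cdots\!\int e^{\sum_{k=1}^{p} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \det(\mathbf{v}_{\sigma(1)},\ldots,\mathbf{v}_{\sigma(p)}) \prod_{k=1}^{p} E\left[h_k(0^M,\Phi;\theta) \,\middle|\, \mathbf{v}_{\sigma(k)}\right] P_{\mathbf{V}}(d\mathbf{v}_1)\cdots P_{\mathbf{V}}(d\mathbf{v}_p),
\end{aligned}$$

where $S_p$ is the set of all permutations of $\{1,\ldots,p\}$. Denoting by $\varepsilon(\sigma)$ the signature of $\sigma$, we obtain

$$\begin{aligned}
\det(\mathbf{X}_1(\theta,\theta^\star),\ldots,\mathbf{X}_p(\theta,\theta^\star))
&= \frac{1}{p!} \int\!\cdots\!\int e^{\sum_{k=1}^{p} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \det(\mathbf{v}_1,\ldots,\mathbf{v}_p) \sum_{\sigma \in S_p} \varepsilon(\sigma) \prod_{k=1}^{p} E\left[h_k(0^M,\Phi;\theta) \,\middle|\, \mathbf{v}_{\sigma(k)}\right] P_{\mathbf{V}}(d\mathbf{v}_1)\cdots P_{\mathbf{V}}(d\mathbf{v}_p) \\
&= \frac{1}{p!} \int\!\cdots\!\int e^{\sum_{k=1}^{p} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \det(\mathbf{v}_1,\ldots,\mathbf{v}_p)\, \det(\Psi_\theta(\mathbf{v}_1),\ldots,\Psi_\theta(\mathbf{v}_p))\, P_{\mathbf{V}}(d\mathbf{v}_1)\cdots P_{\mathbf{V}}(d\mathbf{v}_p).
\end{aligned} \tag{26}$$

From Assumption [Det], this determinant is not null. The proposition is proved. ■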
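The rewriting of (22) as (24) rests on the function $\zeta(x) = \ln((e^x - 1)/x)$. The following minimal sketch checks, for scalar values standing for $\theta^T\mathbf{v}$ and $\theta^{\star T}\mathbf{v}$, that $e^{-a} - e^{-b} = (b - a)\, e^{-b}\, e^{\zeta(b-a)}$, and illustrates that $\zeta$ is increasing with $\zeta(x) \sim x$ at $+\infty$ (both properties are used in the proof of Proposition 8). The function names and the numerical values are illustrative assumptions only.

```python
import math


def zeta(x):
    """zeta(x) = log((exp(x) - 1) / x), with the convention zeta(0) = 0."""
    if x == 0.0:
        return 0.0
    # expm1 keeps accuracy for small |x|; the ratio is positive for every x != 0.
    return math.log(math.expm1(x) / x)


def difference_direct(a, b):
    """exp(-a) - exp(-b), with a standing for theta^T v and b for theta*^T v."""
    return math.exp(-a) - math.exp(-b)


def difference_via_zeta(a, b):
    """Same quantity written as (b - a) * exp(-b) * exp(zeta(b - a))."""
    return (b - a) * math.exp(-b) * math.exp(zeta(b - a))


a, b = 0.3, 1.1  # hypothetical values of theta^T v and theta*^T v
direct = difference_direct(a, b)
factored = difference_via_zeta(a, b)

grid = [-5.0 + 0.5 * i for i in range(21)]
zeta_values = [zeta(x) for x in grid]
ratio_at_50 = zeta(50.0) / 50.0
print(direct, factored, ratio_at_50)
```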
Now let us give some examples for which the criterion is valid. 

Example 1 (linear case) If the function $\Psi_\theta$ is linear and invertible then Assumption [Det(≥)] is
clearly satisfied and [Det(≠)] holds as soon as the support of $P_{\mathbf{V}}$ is not included in a hyperplane. In
particular, if $h_k = V_k$ for every $1 \leq k \leq p$ then $\Psi_\theta$ is equal to the identity function. This situation
corresponds to the pseudo-likelihood procedure, for which we recover the identifiability via our criterion.
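The linear case can be illustrated numerically for $p = 2$. In this minimal sketch, $\Psi(\mathbf{v}) = A\mathbf{v}$ for a hypothetical invertible matrix `A`, and the vectors $\mathbf{v} = (1, v_2)$ are drawn from an arbitrary uniform law; none of these choices come from the paper. Since $\det(A\mathbf{v}, A\mathbf{w}) = \det(A)\det(\mathbf{v},\mathbf{w})$, the product appearing in [Det(≥)] equals $\det(A)\det(\mathbf{v},\mathbf{w})^2$ and has the constant sign $\varepsilon = \mathrm{sign}(\det A)$.

```python
import random


def det2(a, b):
    """Determinant of the 2x2 matrix whose columns are a and b."""
    return a[0] * b[1] - a[1] * b[0]


def apply_matrix(A, v):
    """Matrix-vector product for a 2x2 matrix given as a tuple of rows."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])


def det_products(A, pairs):
    """Values of det(v, w) * det(Psi(v), Psi(w)) for the linear map Psi(v) = A v."""
    return [det2(v, w) * det2(apply_matrix(A, v), apply_matrix(A, w)) for v, w in pairs]


random.seed(1)
# Hypothetical sample of pairs (v, w) with first coordinate 1, as in the setting V1 = 1.
pairs = [((1.0, random.uniform(0.0, 3.0)), (1.0, random.uniform(0.0, 3.0))) for _ in range(200)]
A = ((2.0, 1.0), (0.5, -1.0))  # hypothetical invertible matrix, det(A) = -2.5
eps = -1                       # sign of det(A)

products = det_products(A, pairs)
one_sign = all(eps * x >= -1e-12 for x in products)  # [Det(>=)] up to rounding
not_null = any(abs(x) > 1e-6 for x in products)      # [Det(!=)] on this sample
print(one_sign, not_null)
```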



Example 2 (area process) For the area process defined in Section 3.2.4 with $\lambda^m = \delta_R$ (i.e. the radii
of balls are constant), it is easy to check that the functions $h_{per}$ and $h_{iso}$ respectively defined by (14)
and (15) give a function $\Psi$ which satisfies Assumption [Det]. Indeed the support of $P_{\mathbf{V}}$ is the segment
$\{1\} \times [0, \pi R^2]$ in $\mathbb{R}^2$ and for a vector $\mathbf{v} = (1, v_2)$ the image $\Psi(\mathbf{v})$ is $(\Psi_1(v_2), 0)$ if $v_2 \neq \pi R^2$ and $(2\pi R, 1)$
if $v_2 = \pi R^2$. Therefore, it follows that [Det(≥)] is satisfied and, noting that $0 < P_{\mathbf{V}}((1, \pi R^2)) < 1$, we
deduce that [Det(≠)] holds too.

Example 3 (a general example with p = 2) Example 2 is included in a more general setting when
$p = 2$. Indeed let us suppose that the function $\Psi_\theta$ has the form $\Psi_\theta(v_1,v_2) = \left(g_\theta(v_1,v_2),\ g_\theta(v_1,v_2) f_\theta(v_2/v_1)\right)$
where $g_\theta$ is a nonnegative scalar function and $f_\theta$ is a monotone scalar function. Then $\Psi_\theta$ satisfies
[Det(≥)] and [Det(≠)] holds if $g_\theta(v_1,v_2) f_\theta(v_2/v_1)$ is not $P_{\mathbf{V}}$-a.s. constant when $g_\theta(v_1,v_2)$ is not null.



Example 4 (functions of the type $h_k e^{\theta^T \mathbf{V}}$) Let us suppose that the functions $(h_k)$ ensure that $\Psi_\theta$
satisfies [Det(≥)]; then for any nonnegative function $g_\theta$ from $\mathbb{R}^p$ to $\mathbb{R}$ the functions $(\widetilde{h}_k) = (g_\theta(\mathbf{V}) h_k)$
also provide a function $\widetilde{\Psi}_\theta$ satisfying [Det(≥)]. This remark is related to Section 3.2.2 where it is
suggested to choose functions $(\widetilde{h}_k)$ of the form $(e^{\theta^T \mathbf{V}} h_k)$ to simplify the integral in (6).
As an immediate consequence, the test functions $(V_k e^{\theta^T \mathbf{V}})$, considered in JB}j for the particular multi-
Strauss point process, satisfy [Det(≥)].

5.1.2 Rigidity of the criterion 

In this section, we give some comments about the rigidity of the criterion. In Proposition 7 below, we
show that a function $\Psi_\theta$ satisfying [Det(≥)] has a strong linear structure since, under very reasonable
assumptions, the image of any hyperplane is included in a hyperplane. For example, in the classical
setting where $V_1 = 1$, the function $\Psi_\theta$ is defined from the affine space $H = \{1\} \times \mathbb{R}^{p-1}$ and if $\Psi_\theta$ is
assumed to be continuous then the image of any $(p-2)$-dimensional affine space in $H$ is included in a
hyperplane. This property clearly shows that $\Psi_\theta$ is very rigid when $p \geq 3$.

However, when $p = 2$, we show in Proposition 8 that our criterion is not far from being necessary.
Indeed, we present a large class of examples which do not satisfy our criterion and for which the
identifiability fails.

Proposition 7 Let $\Psi$ be a continuous function from $\mathcal{D}$ to $\mathbb{R}^p$ satisfying [Det(≥)], where the domain $\mathcal{D}$
is a subset of $\mathbb{R}^p$ with the following property: for any $(x_i)_{1 \leq i \leq p} \in \mathcal{D}^p$ such that $\det(x_1,\ldots,x_p) = 0$, and
for any neighborhood $\mathcal{V}$ of $(x_i)$, there exist $(x_i^+)$ and $(x_i^-)$ in $\mathcal{V} \cap \mathcal{D}^p$ such that $\det(x_1^+,\ldots,x_p^+) > 0$ and
$\det(x_1^-,\ldots,x_p^-) < 0$. Then for any hyperplane $H$ in $\mathbb{R}^p$ the image $\Psi(H \cap \mathcal{D})$ is included in a hyperplane
of $\mathbb{R}^p$.

Proof. Let $H$ be a hyperplane in $\mathbb{R}^p$. To prove that $\Psi(H \cap \mathcal{D})$ is included in a hyperplane, it is
sufficient to prove that the dimension of the vector space generated by the vectors in $\Psi(H \cap \mathcal{D})$ is
not equal to $p$. Let us suppose that it is equal to $p$; then there exists $(x_i)_{1 \leq i \leq p}$ in $(H \cap \mathcal{D})^p$ such that
$\det(\Psi(x_1),\ldots,\Psi(x_p)) \neq 0$. Since $\dim(H) = p - 1$, we have $\det(x_1,\ldots,x_p) = 0$. By continuity of $\Psi$
and by the local properties of $\mathcal{D}$ assumed in Proposition 7, we find $(x_i^+)_{1 \leq i \leq p}$ and $(x_i^-)_{1 \leq i \leq p}$ in $\mathcal{D}^p$ such
that $\det(x_1^+,\ldots,x_p^+)\det(\Psi(x_1^+),\ldots,\Psi(x_p^+)) > 0$ and $\det(x_1^-,\ldots,x_p^-)\det(\Psi(x_1^-),\ldots,\Psi(x_p^-)) < 0$, which
contradicts Assumption [Det(≥)]. ■

In the case where $\mathcal{D} = \mathbb{R}^p$, if we assume that $\Psi(\mathbb{R}^p)$ is not reduced to a hyperplane and that $\Psi$ is
differentiable at the origin, then we can show that $\Psi$ satisfies [Det(≥)] if and only if $\Psi(x) = g(x)Ax$,
where $A$ is an invertible matrix and $g$ a nonnegative scalar function. It means that $\Psi$ is quasi linear and
so the rigidity of $\Psi$ is very strong.

Now let us focus on the case where $p = 2$ and let us show that, while our criterion seems very
constraining, it is not far from being necessary in this case. We suppose that $p = 2$, $V_1 = 1$ and that the
support of $V_2$ is included in an interval $[a, b]$. Let us remark that this case occurs for the area process
with $[a, b] = [0, \pi R^2]$. First of all, it is easy to check visually, depending on the geometry of $\gamma_\theta$ defined
by the curve $\Psi_\theta(\{1\} \times [a, b])$, whether $\Psi_\theta$ satisfies [Det] (see Figure 1 for examples).

Moreover let us show that the criterion is not far from being necessary. Suppose that the functions
$(h_i)$ do not depend on $\theta$ and that $\Psi := \Psi_\theta$ satisfies, for $\varepsilon = \pm 1$, the following variant of Assumption [Det],
decomposed into three assumptions:

[Det(≠)] $\det(\mathbf{v}_1,\mathbf{v}_2)\,\det(\Psi(\mathbf{v}_1),\Psi(\mathbf{v}_2))$ is not $P_{\mathbf{V}}^{\otimes 2}$-a.s. identically null.

[Det(≥)] There exists $\delta > 0$ such that for $P_{\mathbf{V}}^{\otimes 2}$-a.s. every $(\mathbf{v}_1,\mathbf{v}_2)$ in $(\{1\} \times [a, a+\delta])^2$

$$\varepsilon\, \det(\mathbf{v}_1,\mathbf{v}_2)\,\det(\Psi(\mathbf{v}_1),\Psi(\mathbf{v}_2)) \geq 0.$$

[Det(≤)] There exists $\delta > 0$ such that for $P_{\mathbf{V}}^{\otimes 2}$-a.s. every $(\mathbf{v}_1,\mathbf{v}_2)$ in $(\{1\} \times [b-\delta, b])^2$

$$\varepsilon\, \det(\mathbf{v}_1,\mathbf{v}_2)\,\det(\Psi(\mathbf{v}_1),\Psi(\mathbf{v}_2)) \leq 0.$$

See Figure 1 for an example of such $\Psi$. Obviously, this situation is not exactly the opposite of Assumption
[Det], but it is strongly related to it. Then, we have the following proposition, which proves that the
identifiability fails for this large class of examples.



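Proposition 8 If the functions $(h_i)$ are nonnegative, if $\Psi$ satisfies the variant of [Det] described above and if
$\det\left(E_{P_{\mathbf{V}}}(\Psi(\mathbf{v})\mathbf{v}^T)\right) \neq 0$, then [C3] fails.

Let us note that even if the assumption $\det\left(E_{P_{\mathbf{V}}}(\Psi(\mathbf{v})\mathbf{v}^T)\right) \neq 0$ seems unnatural, it is in general
satisfied.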

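Proof. Let us show that (22) admits another solution than $\theta^\star$. We only give here the main lines of the
proof.

We denote by $O$ the set in $\mathbb{R}^2$ containing the vectors $\mathbf{u}$ which are orthogonal to at least one vector $\mathbf{v}$
in $\{1\} \times [a, b]$, i.e. $O = \{\mathbf{u} \in \mathbb{R}^2,\ \exists \mathbf{v} \in \{1\} \times [a, b],\ \mathbf{u}^T\mathbf{v} = 0\}$. In fact, $O$ is the union of a cone $O^+$ in the
upper half plane and a cone $O^-$ in the lower half plane. For any $\delta > 0$, the expression of the determinant
in (26) can be split in two parts

$$\begin{aligned}
\det(\mathbf{X}_1(\theta,\theta^\star),\mathbf{X}_2(\theta,\theta^\star))
&= \frac{1}{2} \iint_{[a,a+\delta]^2} e^{\sum_{k=1}^{2} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \det(\mathbf{v}_1,\mathbf{v}_2)\,\det(\Psi(\mathbf{v}_1),\Psi(\mathbf{v}_2))\, P_{\mathbf{V}}(d\mathbf{v}_1) P_{\mathbf{V}}(d\mathbf{v}_2) \\
&\quad + \frac{1}{2} \iint_{[a,b]^2 \setminus [a,a+\delta]^2} e^{\sum_{k=1}^{2} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)}\, \det(\mathbf{v}_1,\mathbf{v}_2)\,\det(\Psi(\mathbf{v}_1),\Psi(\mathbf{v}_2))\, P_{\mathbf{V}}(d\mathbf{v}_1) P_{\mathbf{V}}(d\mathbf{v}_2).
\end{aligned} \tag{27}$$

Let $\mathbf{u} \neq 0$ in $O^+$ and $\theta = \theta^\star + \alpha\mathbf{u}$ with $\alpha > 0$. From [Det(≠)] and [Det(≥)], since $\zeta$ is increasing
and $\zeta(x) \sim x$ as $x \to +\infty$, we deduce that the first integral in (27) dominates the second one when $\alpha$
goes to infinity. Therefore $\det(\mathbf{X}_1(\theta,\theta^\star),\mathbf{X}_2(\theta,\theta^\star))$ has the sign $\varepsilon$ (defined in the assumptions above) when $\alpha$ is large
enough. Similarly, if $\mathbf{u}$ is in $O^-$ then from [Det(≠)] and [Det(≤)], $\det(\mathbf{X}_1(\theta,\theta^\star),\mathbf{X}_2(\theta,\theta^\star))$ has the
sign $-\varepsilon$ when $\alpha$ is large enough. This implies that the sign of $\det(\mathbf{X}_1(\theta,\theta^\star),\mathbf{X}_2(\theta,\theta^\star))$, with $\theta = \theta^\star + \mathbf{u}$,
is different for $\mathbf{u}$ in $O^+$ or in $O^-$ as soon as $|\mathbf{u}|$ is large enough. We deduce that there exists a continuous
curve $t \mapsto \mathbf{u}(t)$ which crosses $O$ such that $\det(\mathbf{X}_1(\theta(t),\theta^\star),\mathbf{X}_2(\theta(t),\theta^\star)) = 0$ for every $\theta(t) = \theta^\star + \mathbf{u}(t)$.
Let us note that the assumption $\det\left(E_{P_{\mathbf{V}}}(\Psi(\mathbf{v})\mathbf{v}^T)\right) \neq 0$ ensures that $\det(\mathbf{X}_1(\theta^\star,\theta^\star),\mathbf{X}_2(\theta^\star,\theta^\star)) \neq 0$,
and so $\mathbf{u}(t)$ is never null. Let us show that there exists $t_0$ such that $\theta(t_0)$ is a solution of the system
in (24).

Since the functions $(h_i)$ are nonnegative and from Definition (25), we obtain that for every $t$, $\mathbf{X}_1(\theta(t),\theta^\star)$
is collinear to a vector in $\{1\} \times [a, b]$. By continuity of the function $t \mapsto \mathbf{X}_1(\theta(t),\theta^\star)^T \mathbf{u}(t)$ and by the
intermediate value theorem, there exists $t_0$ such that $\mathbf{u}(t_0)$ is orthogonal to $\mathbf{X}_1(\theta(t_0),\theta^\star)$. Since the
determinant $\det(\mathbf{X}_1(\theta(t_0),\theta^\star),\mathbf{X}_2(\theta(t_0),\theta^\star)) = 0$, $\mathbf{u}(t_0)$ is also orthogonal to $\mathbf{X}_2(\theta(t_0),\theta^\star)$ and it follows that the
system in (24) has at least two solutions, $\theta^\star$ and $\theta(t_0)$. The identification assumption (22), that is [C3], therefore fails. ■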




Figure 1: on the left (respectively in the middle), an example of a function $\Psi$ satisfying (respectively not satisfying)
[Det(≥)]. On the right, an example of $\Psi$ satisfying the variant of [Det] considered in Proposition 8.



5.2 The case K>p 

In the case where K > p, we noticed, in the introduction, that the identification problem should be 
simpler. Nevertheless, we did not find a satisfactory criterion to prove it. The following Proposition [9] 
gives a sufficient criterion which is probably far from being necessary. It is based on a slight modification 
of Assumption [Det] which does not seem to be the appropriate tool in this setting. However, in the 
case where p = 2 and K = 3, this condition reduces to a nice geometrical property which can be checked 
easily. 

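First, let us present the criterion. We denote by $\mathcal{A}$ the set of all subsets with $p$ elements in $\{1,\ldots,K\}$,

$$\mathcal{A} = \{I \subset \{1,\ldots,K\}, \text{ such that } \#(I) = p\}.$$

We say that Assumption [Det'] is satisfied if, for every $\theta$ in $\Theta$, there exists a family of real coefficients
$(c_I)_{I \in \mathcal{A}}$ such that the two following assumptions hold:

[Det'(≠)] $\sum_{I \in \mathcal{A}} c_I \det(\mathbf{v}_1,\ldots,\mathbf{v}_p)\,\det(\Psi_\theta^I(\mathbf{v}_1),\ldots,\Psi_\theta^I(\mathbf{v}_p))$ is not $(P_{\mathbf{V}})^{\otimes p}$-a.s. identically null.

[Det'(≥)] For $(P_{\mathbf{V}})^{\otimes p}$-a.s. every $(\mathbf{v}_1,\ldots,\mathbf{v}_p)$ in $(\mathbb{R}^p)^p$

$$\sum_{I \in \mathcal{A}} c_I \det(\mathbf{v}_1,\ldots,\mathbf{v}_p)\,\det(\Psi_\theta^I(\mathbf{v}_1),\ldots,\Psi_\theta^I(\mathbf{v}_p)) \geq 0,$$

where $\Psi_\theta^I(\mathbf{v})$ denotes the $p$-dimensional vector extracted from $\Psi_\theta(\mathbf{v})$ for the coordinates given by $I$. In
the particular case where $K = p$, Assumption [Det'] becomes Assumption [Det] exactly.
Our criterion is the following.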

Proposition 9 Assumption [Det'] ensures that Assumption [C3] holds. 

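Proof. As in the proof of Proposition 6, to show that (22) admits the unique solution $\theta^\star$, it is sufficient
to prove that there exists $I$ in $\mathcal{A}$ such that, for all $\theta \neq \theta^\star$, $\det((\mathbf{X}_i(\theta,\theta^\star))_{i \in I}) \neq 0$. It is equivalent to: for
every $\theta \neq \theta^\star$ in $\Theta$ there exists a family of real coefficients $(c_I)_{I \in \mathcal{A}}$ such that

$$\sum_{I \in \mathcal{A}} c_I \det((\mathbf{X}_i(\theta,\theta^\star))_{i \in I}) > 0.$$

With calculations as in (26) we obtain

$$p! \sum_{I \in \mathcal{A}} c_I \det((\mathbf{X}_i(\theta,\theta^\star))_{i \in I}) = \int\!\cdots\!\int e^{\sum_{k=1}^{p} -\theta^{\star T}\mathbf{v}_k + \zeta((\theta^\star-\theta)^T\mathbf{v}_k)} \left( \sum_{I \in \mathcal{A}} c_I \det(\mathbf{v}_1,\ldots,\mathbf{v}_p)\,\det(\Psi_\theta^I(\mathbf{v}_1),\ldots,\Psi_\theta^I(\mathbf{v}_p)) \right) P_{\mathbf{V}}(d\mathbf{v}_1)\cdots P_{\mathbf{V}}(d\mathbf{v}_p). \tag{28}$$

Thanks to [Det'], this quantity is positive. ■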

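In the case where $p = 2$ and $K = 3$, [Det'(≥)] is satisfied if and only if there exists $(a, b, c)$ in $\mathbb{R}^3$ such
that for every $\mathbf{v}_1$ and $\mathbf{v}_2$ with $\det(\mathbf{v}_1,\mathbf{v}_2) > 0$

$$a \det\left(\Psi_\theta^{\{1,2\}}(\mathbf{v}_1), \Psi_\theta^{\{1,2\}}(\mathbf{v}_2)\right) + b \det\left(\Psi_\theta^{\{1,3\}}(\mathbf{v}_1), \Psi_\theta^{\{1,3\}}(\mathbf{v}_2)\right) + c \det\left(\Psi_\theta^{\{2,3\}}(\mathbf{v}_1), \Psi_\theta^{\{2,3\}}(\mathbf{v}_2)\right) \geq 0. \tag{29}$$

If we denote by $\wedge$ the vectorial product in $\mathbb{R}^3$, the inequality in (29) means that the following set

$$\left\{\Psi_\theta(\mathbf{v}_1) \wedge \Psi_\theta(\mathbf{v}_2), \text{ for all } \mathbf{v}_1, \mathbf{v}_2 \text{ such that } \det(\mathbf{v}_1,\mathbf{v}_2) > 0\right\} \tag{30}$$

is included in the half space in $\mathbb{R}^3$ with equation $cx - by + az \geq 0$. In the setting where $V_1 = 1$ and $V_2$ is
included in an interval $[a, b]$, as for the area process, this condition becomes a geometrical characteristic
of the curve $\gamma_\theta = \Psi_\theta(\{1\} \times [a, b])$ in $\mathbb{R}^3$, which is easy to check visually.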
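The passage from (29) to the half-space condition is the identity $a\,\det^{\{1,2\}} + b\,\det^{\{1,3\}} + c\,\det^{\{2,3\}} = (c, -b, a) \cdot (\mathbf{x} \wedge \mathbf{y})$, where $\mathbf{x}$ and $\mathbf{y}$ stand for $\Psi_\theta(\mathbf{v}_1)$ and $\Psi_\theta(\mathbf{v}_2)$. The following minimal sketch checks it on arbitrary integer vectors; the function names and the test values are illustrative assumptions, not part of the paper.

```python
def cross(x, y):
    """Vectorial product of two vectors of R^3."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def minor(x, y, i, j):
    """Determinant of the 2x2 matrix built from coordinates i and j (0-based) of x and y."""
    return x[i] * y[j] - x[j] * y[i]


def lhs_29(a, b, c, x, y):
    """Left-hand side of (29), with x = Psi(v1) and y = Psi(v2) in R^3."""
    return a * minor(x, y, 0, 1) + b * minor(x, y, 0, 2) + c * minor(x, y, 1, 2)


def half_space_form(a, b, c, x, y):
    """Value of c*w1 - b*w2 + a*w3 at the point w = x ^ y."""
    w = cross(x, y)
    return c * w[0] - b * w[1] + a * w[2]


# Hypothetical values, chosen as integers so that both sides are computed exactly.
x, y = (1, 2, 3), (4, 5, 6)
a, b, c = 2, -1, 3
print(lhs_29(a, b, c, x, y), half_space_form(a, b, c, x, y))
```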

6 Extension in the presence of non-hereditary interaction 

In several recent papers, Gibbs processes with non-hereditary interactions are considered, in particular in
the domain of stochastic geometry (see [13], [E]). The parametric estimation of such models has also been
investigated. The first results in this direction have been given in [16] via a pseudo-likelihood procedure
based on a generalization of the Georgii-Nguyen-Zessin formula (QJ. The same kind of generalization is
possible for the Takacs-Fiksel procedure. We address this improvement in this section.

In the following, we do not assume that the energy $V_\Lambda(\varphi;\theta)$ satisfies the heredity assumption (2). The
first consequences are that the local energy $V(x^m|\varphi;\theta)$ is not defined in general and that the
Georgii-Nguyen-Zessin formula is not available. Let us begin by presenting the generalization of this formula, as
stated in [16], Proposition 2, which is valid in the hereditary and non-hereditary settings.

We first need to recall the concept of removable points, which has been introduced in [16], Definition 3.

Definition 2 A point $x^m$ in a configuration $\varphi$ is called removable if there exists a bounded set $\Lambda$
containing $x$ such that $V_\Lambda(\varphi \setminus x^m;\theta) < +\infty$. We denote by $\mathcal{R}_\theta(\varphi)$ the set of removable points in $\varphi$.

Let us remark that the removable set is only related to the support of the underlying Gibbs measure.
The local energy $V(x^m|\varphi \setminus x^m;\theta)$ of any removable point $x^m \in \mathcal{R}_\theta(\varphi)$ can then be defined by the classical
expression (3) where $\Lambda$ comes from Definition 2. In the hereditary case, all the points of $\varphi$ are removable
and we regain the classical definition of the local energy.

The generalization of the Georgii-Nguyen-Zessin formula is the following equation

$$E\left(\int_{\mathbb{R}^d \times \mathbb{M}} h(x^m,\Phi;\theta)\, e^{-V(x^m|\Phi;\theta^\star)}\, \mu(dx^m)\right) = E\left(\sum_{x^m \in \mathcal{R}_{\theta^\star}(\Phi)} h(x^m,\Phi \setminus x^m;\theta)\right). \tag{31}$$

Let us notice that the only difference with the classical formula is that the sum is restricted to the
removable points. Now, let us present the consequences of this formula on the Takacs-Fiksel procedure.
We have to consider the two following cases:

• When the support of the Gibbs measure does not depend on $\theta$: the set of removable points $\mathcal{R}_\theta(\varphi)$
does not depend on $\theta$ either. In this case, the Takacs-Fiksel estimator is defined by (7), and $C_\Lambda$ is defined
as before, with the exception of the sum, which is restricted to the removable points:

$$C_\Lambda(\varphi;h,\theta) := \int_{\Lambda \times \mathbb{M}} h(x^m,\varphi;\theta)\, e^{-V(x^m|\varphi;\theta)}\, \mu(dx^m) - \sum_{x^m \in \varphi_\Lambda \cap \mathcal{R}(\varphi)} h(x^m,\varphi \setminus x^m;\theta). \tag{32}$$

The sum is computable because, by assumption, the set $\mathcal{R}_{\theta^\star}(\varphi)$ does not depend on $\theta^\star$. In this
situation, with the same assumptions [C] and [N], the consistency and the asymptotic normality
of the estimator may be proved as in Section 8.



• When the support of the Gibbs measure depends on some parameters $\theta_{hc} = (\theta_1,\ldots,\theta_q)$, $q \leq p$
(called the hardcore parameters): the remaining parameters $\theta_{sm} = (\theta_{q+1},\ldots,\theta_p)$ are supposed
to parameterize the classical (or smooth) interaction between points. The set of removable points
$\mathcal{R}_\theta(\varphi)$ therefore depends on $\theta_{hc}$ only. The estimation issue is more complicated in this case. Indeed,
Assumption [C2] requires some regularity of the interaction with respect to the parameter $\theta$,
such as continuity, which clearly fails to be true for the support parameter $\theta_{hc}$. The Takacs-Fiksel
procedure is therefore unable to estimate $\theta_{hc}$. Note that this problem is not specific to the presence of
non-hereditary interactions, but arises as soon as some hardcore parameters have to be estimated. In
Qj5], the authors solve this problem in both the hereditary and non-hereditary settings, by introducing
a two-step estimation procedure. We can follow the same strategy here. In a first step, the estimator
$\widehat{\theta}_{hc}$ of the hardcore parameter is defined in a natural way according to the observed support of the
point process (see Section 4.2.1 in [16]). Then, in a second step, the Takacs-Fiksel estimator $\widehat{\theta}_{sm}$ is
defined by (7) with

$$C_\Lambda(\varphi;h,\theta_{sm}) := \int_{\Lambda \times \mathbb{M}} h(x^m,\varphi;\theta_{sm},\widehat{\theta}_{hc})\, e^{-V(x^m|\varphi;\theta_{sm},\widehat{\theta}_{hc})}\, \mu(dx^m) - \sum_{x^m \in \varphi_\Lambda \cap \mathcal{R}_{\widehat{\theta}_{hc}}(\varphi)} h(x^m,\varphi \setminus x^m;\theta_{sm},\widehat{\theta}_{hc}). \tag{33}$$

Let us remark that the estimator $\widehat{\theta}_{hc}$ is plugged into the computation of $C_\Lambda$. In particular, the
removable points are determined with respect to $\widehat{\theta}_{hc}$. As in [16], the regularity and integrability
assumptions of type [C] for $\theta_{sm}$ and conditions on the support of the Gibbs measure are required in
order to obtain the consistency of $(\widehat{\theta}_{hc},\widehat{\theta}_{sm})$. The asymptotic normality is more difficult to obtain
and no general results are available. In fact, it seems that there is no hope of obtaining asymptotic
normality without controlling the rate of convergence of $\widehat{\theta}_{hc}$, which should strongly depend on the
model.

7 A short simulation study 

The first aim of the present paper is to explore the asymptotic properties of the Takacs-Fiksel estimate. 
Given a model, the important question of correctly choosing the test functions is an open question and 
will not be treated here. Also, a complete comparison between all the existing parametric estimation 
methods has not been attempted. In this section, we just present a brief simulation study to illustrate 
the methodology and the asymptotic results. 

The model considered in this section is the (non-marked) Strauss process described by (11) with
two parameters. Table 1 and Figure 2 give an empirical comparison of the approximated maximum
likelihood estimate (MLE) described in [25], the maximum pseudo-likelihood estimate (MPLE), which is
a particular case of the Takacs-Fiksel estimate, and the explicit Takacs-Fiksel estimate (TFE) described
in Section 3.2.3 and in particular in (13). To generate Strauss point processes and to compute the MLE,
the R package spatstat is used. The MPLE and the TFE are computed from their respective defining
equations, where the integrals are approximated by Monte-Carlo. Strauss point processes are generated on the
window $[0,3]^2 \oplus R$ where $R$ is set to 0.05. Recall that this parameter corresponds to the finite range of
the Strauss process. Then the estimates are computed on the window $[0,\tau]^2 \oplus R$ for $\tau = 1, 2, 3$, using an
$R$-erosion. We use the parametrization of spatstat, i.e. the intensity parameter is $\beta^\star := e^{-\theta_1^\star}$ and the
interaction parameter is $\gamma^\star := e^{-\theta_2^\star}$. We set $\beta^\star = 100$ and $\gamma^\star = 0.5$. The estimation of $(\beta^\star,\gamma^\star)$ by the
TFE is thus $(\widehat{\beta},\widehat{\gamma}) = (e^{-\widehat{\theta}_1}, e^{-\widehat{\theta}_2})$ where $(\widehat{\theta}_1,\widehat{\theta}_2)$ are defined in (13).
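The change of parametrization and the border correction can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the points are assumed to be given as coordinate pairs in an observation window containing $[0,\tau]^2$, and the $R$-erosion is read here as keeping only the points of the inner square $[0,\tau]^2$, so that the whole $R$-neighborhood of each retained point is observed. The paper itself relies on spatstat, which is not reproduced here.

```python
import math


def to_spatstat(theta1, theta2):
    """Map (theta1, theta2) to (beta, gamma) = (exp(-theta1), exp(-theta2))."""
    return math.exp(-theta1), math.exp(-theta2)


def from_spatstat(beta, gamma):
    """Inverse map: (beta, gamma) to (theta1, theta2) = (-log(beta), -log(gamma))."""
    return -math.log(beta), -math.log(gamma)


def erode(points, tau):
    """Keep the points lying in the inner square [0, tau]^2 (assumed reading of the R-erosion)."""
    return [(x, y) for (x, y) in points if 0.0 <= x <= tau and 0.0 <= y <= tau]


theta_star = from_spatstat(100.0, 0.5)  # values beta* = 100, gamma* = 0.5 from the text
beta_gamma = to_spatstat(*theta_star)

# Hypothetical points observed in [-R, 1 + R]^2 with R = 0.05.
points = [(-0.01, 0.5), (0.5, 0.5), (1.02, 0.3)]
inner = erode(points, 1.0)
print(theta_star, beta_gamma, inner)
```

With these values, $\theta_1^\star = -\ln 100$ is negative and $\theta_2^\star = \ln 2$ is positive, the latter corresponding to an inhibitive interaction ($\gamma^\star < 1$).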

As expected, the three estimates have decreasing variance when the domain of observation grows. 
Since the explicit TFE is obtained very quickly, its behavior appears very satisfactory in comparison with 
the MLE and MPLE. 



1 The MPLE is also implemented in spatstat but we did not use it. In fact, it appears that the implementation therein 
leads to biased estimations. 



Estimates of the parameter $\beta^\star = 100$

W_τ        MLE               MPLE              TFE
[0,1]^2    103.50 (18.24)    101.19 (16.92)    101.44 (18.95)
[0,2]^2    101.03 (8.04)     100.33 (8.66)     99.99 (8.71)
[0,3]^2    100.38 (6.58)     99.55 (5.47)      99.96 (6.27)

Estimates of the parameter $\gamma^\star = 0.5$

W_τ        MLE               MPLE              TFE
[0,1]^2    0.50 (0.17)       0.50 (0.16)       0.52 (0.22)
[0,2]^2    0.50 (0.08)       0.50 (0.08)       0.51 (0.11)
[0,3]^2    0.51 (0.05)       0.49 (0.05)       0.51 (0.06)

Table 1: Empirical means and standard deviations between brackets for the MLE, MPLE and the (explicit)
TFE based on $m = 500$ replications of a Strauss point process with parameters $\beta^\star = 100$, $\gamma^\star = 0.5$,
and (known) interaction range $R = 0.05$, generated in the window $[0,3]^2 \oplus R$ and estimated on the window
$W_\tau = [0,\tau]^2 \oplus R$ with $R$-erosion for $\tau = 1, 2, 3$.

Figure 2: Boxplots of the 3 estimates of $\beta^\star$ (left) and $\gamma^\star$ (right) when $\tau = 3$, based on $m = 500$ replications.



           Empirical covariance of $\tau \times (\widehat{\theta}_1, \widehat{\theta}_2)^T$              MC approximation of
τ = 1               τ = 2               τ = 3               $(\mathcal{E}^T)^{-1} \Sigma\, \mathcal{E}^{-1}$

 0.037  -0.051       0.031  -0.048       0.035  -0.051       0.034  -0.045
-0.051   0.185      -0.048   0.185      -0.051   0.179      -0.045   0.154

Table 2: Comparison between the renormalized empirical covariance matrix of the estimates $(\widehat{\theta}_1,\widehat{\theta}_2)$
(based on $m = 500$ replications of the Strauss process on $[0,\tau]^2 \oplus R$ as in Table 1) and the asymptotic
covariance matrix (approximated by Monte-Carlo simulations).



Now, we focus more particularly on the explicit TFE given by (13). We wish to assess whether
the empirical covariance matrix of $(\widehat{\theta}_1,\widehat{\theta}_2)^T$ agrees with the asymptotic covariance matrix deduced from
(17) in Theorem 4. For this reason, we come back to the parameterization of the Strauss model of
Section 3.2.3. From (17), we deduce that the asymptotic covariance matrix of $\tau \times (\widehat{\theta}_1,\widehat{\theta}_2)^T$, as $\tau$ tends
to $+\infty$, is $\mathcal{E}^{-1}(\mathbf{h},\theta^\star)^T\, \Sigma(\mathbf{h},\theta^\star)\, \mathcal{E}^{-1}(\mathbf{h},\theta^\star)$ (a simplification occurs here since $K = p = 2$). Moreover, we
can prove that the matrix $\mathcal{E}(\mathbf{h},\theta^\star)$ reduces to



£(h,r) = |A| 



E(JV ,a) e^E(7V liA ) 
e iE(N hA ) 



where $\Lambda$ is any bounded domain. Since the matrices $\mathcal{E}(\mathbf{h},\theta^\star)$ and $\Sigma(\mathbf{h},\theta^\star)$ depend on expectations with respect to $P_{\theta^\star}$, they are approximated by Monte-Carlo simulations. Table 2 contains the empirical covariance of $\tau \times (\hat\theta_1, \hat\theta_2)^T$ for $\tau = 1, 2, 3$, as well as the asymptotic covariance matrix approximated by Monte-Carlo simulations. Finally, in order to appreciate the Gaussian limit, Figure 3 compares the empirical distribution of the two estimates $(\hat\theta_1, \hat\theta_2)$ with the expected limit when $\tau = 3$. Note that the same kind of results as in Table 2 and Figure 3 could have been easily obtained for $(\hat\beta, \hat\gamma)$ by use of the delta method.
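The comparison described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the window area is taken to be $\tau^2$ so that the renormalization is a factor $\tau$, and the map $(\beta,\gamma) = (e^{\theta_1}, e^{\theta_2})$ used for the delta method is a guess at the parameterization of Section 3.2.3, which may differ in sign (a sign flip on both coordinates leaves the resulting covariance unchanged).

```python
import numpy as np

def renormalized_cov(theta_hats, tau):
    """Empirical covariance of tau * theta_hat (rows = replications)."""
    theta_hats = np.asarray(theta_hats, dtype=float)
    return tau**2 * np.cov(theta_hats, rowvar=False)

def sandwich_cov(E, Sigma):
    """E^{-T} Sigma E^{-1}, valid when E is square and invertible (K = p)."""
    E_inv = np.linalg.inv(np.asarray(E, dtype=float))
    return E_inv.T @ np.asarray(Sigma, dtype=float) @ E_inv

def delta_method_cov(cov_theta, jacobian):
    """First-order covariance of g(theta_hat): J cov J^T."""
    J = np.asarray(jacobian, dtype=float)
    return J @ np.asarray(cov_theta, dtype=float) @ J.T

# Monte-Carlo approximation of the asymptotic covariance, copied from Table 2.
asym_cov = np.array([[0.034, -0.045],
                     [-0.045, 0.154]])

# Hypothetical log-scale parameterization: Jacobian of (exp(t1), exp(t2))
# evaluated at beta* = 100, gamma* = 0.5.
J = np.diag([100.0, 0.5])
cov_beta_gamma = delta_method_cov(asym_cov, J)

tau = 3.0
# Approximate standard deviations of (beta_hat, gamma_hat) on [0, tau]^2.
sd_beta_gamma = np.sqrt(np.diag(cov_beta_gamma)) / tau
print(sd_beta_gamma)
```

Under these assumptions, the two approximate standard deviations are of the same order as those reported in Table 1 for the largest window.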
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Figure 3: Histogram of $\tau(\hat\theta_j - \theta_j^\star)$ for $j = 1$ (intensity parameter, left) and $j = 2$ (interaction parameter, right), for the large window, i.e. $\tau = 3$. The curves correspond to the densities of the asymptotic Gaussian laws, which are centered with variances 0.034 (left) and 0.154 (right) according to Table 2.



8 Proofs of asymptotic results 

8.1 Proof of Theorem [3] 

If there is more than one stationary Gibbs measure, then some non-ergodic Gibbs measures automatically exist because, in the convex set of all Gibbs measures, only the extremal measures are ergodic. But any stationary Gibbs measure can be represented as a mixture of ergodic measures ([22], Theorem 14.10). Due to this decomposition, we can assume that $P_{\theta^\star}$ is ergodic to prove the consistency of the Takacs-Fiksel estimate. The proof is split into two steps.

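Step 1. $U_{\Lambda_n}$ is a contrast function. Under [C1], the ergodic Lemma 2 and the GNZ formula may be applied to prove that, $P_{\theta^\star}$-a.s.,

\[
\begin{aligned}
|\Lambda_n|^{-1} C_{\Lambda_n}(\Phi; h_k, \theta)
&\to E\left( h_k(0^M, \Phi; \theta)\, e^{-V(0^M|\Phi;\theta)} \right) - E\left( h_k(0^M, \Phi \setminus 0^M; \theta) \right) \\
&= E\left( h_k(0^M, \Phi; \theta) \left( e^{-V(0^M|\Phi;\theta)} - e^{-V(0^M|\Phi;\theta^\star)} \right) \right).
\end{aligned}
\]

Therefore, as $n \to +\infty$, one obtains, $P_{\theta^\star}$-a.s.,

\[
U_{\Lambda_n}(\Phi; \mathbf{h}, \theta) \to U(\theta) := \sum_{k=1}^{K} E\left( h_k(0^M, \Phi; \theta) \left( e^{-V(0^M|\Phi;\theta)} - e^{-V(0^M|\Phi;\theta^\star)} \right) \right)^2.
\]

Note that $U(\theta^\star) = 0$. Together with Assumptions [C2] and [C3], this proves that $U_{\Lambda_n}$ is a continuous contrast function vanishing only at $\theta^\star$.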
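To make the notion of contrast function concrete, here is a minimal sketch in the simplest special case, which is an assumption introduced for illustration and not an example taken from the paper: an unmarked homogeneous Poisson process with $V(x|\varphi;\theta) = \theta$ (intensity $e^{-\theta}$) and a single test function $h \equiv 1$. Then $C_{\Lambda}(\varphi;h,\theta) = |\Lambda| e^{-\theta} - N(\varphi_\Lambda)$, and the contrast $|\Lambda|^{-2} C_\Lambda^2$ vanishes at $\hat\theta = -\log(N(\varphi_\Lambda)/|\Lambda|)$. The function names are hypothetical.

```python
import numpy as np

def contrast(theta, n_points, area):
    """Contrast for V = theta and h = 1: (e^{-theta} - N/|Lambda|)^2."""
    c = area * np.exp(-theta) - n_points   # C_Lambda(phi; h, theta)
    return (c / area) ** 2

def tfe_poisson(n_points, area):
    """Closed-form minimiser of the contrast above (requires n_points > 0)."""
    return -np.log(n_points / area)

rng = np.random.default_rng(1)
theta_star = -np.log(100.0)   # true intensity e^{-theta*} = 100
area = 9.0                    # area of the window [0, 3]^2
n = rng.poisson(np.exp(-theta_star) * area)   # only the count matters here

theta_hat = tfe_poisson(n, area)

# Cross-check the closed form against a brute-force grid minimisation.
grid = np.linspace(theta_star - 1.0, theta_star + 1.0, 2001)
theta_grid = grid[np.argmin(contrast(grid, n, area))]
print(theta_star, theta_hat, theta_grid)
```

In this toy case the limit contrast is $U(\theta) = (e^{-\theta} - e^{-\theta^\star})^2$, which is continuous and vanishes only at $\theta^\star$.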

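Step 2. Modulus of continuity

The modulus of continuity of the contrast process is defined for all $\varphi \in \Omega$ and all $\eta > 0$ by

\[
W_n(\varphi, \eta) = \sup\left\{ \left| U_{\Lambda_n}(\varphi; \mathbf{h}, \theta) - U_{\Lambda_n}(\varphi; \mathbf{h}, \theta') \right| : \theta, \theta' \in \Theta,\ \|\theta - \theta'\| < \eta \right\}.
\]

This step aims at proving that there exists a sequence $(\varepsilon_\ell)_{\ell \geq 1}$, with $\varepsilon_\ell \to 0$ as $\ell \to +\infty$, such that for all $\ell \geq 1$

\[
P\left( \limsup_{n \to +\infty} \left\{ W_n\left(\Phi, \tfrac{1}{\ell}\right) \geq \varepsilon_\ell \right\} \right) = 0. \tag{34}
\]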

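Let $\theta, \theta' \in \Theta$, then

\[
\left| U_{\Lambda_n}(\varphi; \mathbf{h}, \theta) - U_{\Lambda_n}(\varphi; \mathbf{h}, \theta') \right|
\leq \sum_{k=1}^{K} \Big\{ |\Lambda_n|^{-1} \left| C_{\Lambda_n}(\varphi; h_k, \theta) - C_{\Lambda_n}(\varphi; h_k, \theta') \right| \times |\Lambda_n|^{-1} \left( |C_{\Lambda_n}(\varphi; h_k, \theta)| + |C_{\Lambda_n}(\varphi; h_k, \theta')| \right) \Big\}. \tag{35}
\]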
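The inequality (35) can be recovered from an elementary identity, assuming (as the expression of the gradient of the contrast used in Section 8.2 suggests) that $U_{\Lambda_n}(\varphi;\mathbf{h},\theta) = |\Lambda_n|^{-2} \sum_{k=1}^K C_{\Lambda_n}(\varphi;h_k,\theta)^2$. With the shorthand $a_k = |\Lambda_n|^{-1} C_{\Lambda_n}(\varphi;h_k,\theta)$ and $b_k = |\Lambda_n|^{-1} C_{\Lambda_n}(\varphi;h_k,\theta')$, introduced here only for this remark:

\[
\left| \sum_{k=1}^{K} \left( a_k^2 - b_k^2 \right) \right|
= \left| \sum_{k=1}^{K} (a_k - b_k)(a_k + b_k) \right|
\leq \sum_{k=1}^{K} |a_k - b_k| \left( |a_k| + |b_k| \right).
\]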

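Under Assumption [C4], there exists $n_0 \geq 1$ such that for all $n \geq n_0$ we have, for $P_{\theta^\star}$-a.e. $\varphi$,

\[
\begin{aligned}
|\Lambda_n|^{-1} \left( |C_{\Lambda_n}(\varphi; h_k, \theta)| + |C_{\Lambda_n}(\varphi; h_k, \theta')| \right)
&\leq 2 |\Lambda_n|^{-1} \int_{\Lambda_n \times \mathbb{M}} \max_{\theta \in \Theta} \left| h_k(x^m, \varphi; \theta)\, e^{-V(x^m|\varphi;\theta)} \right| \mu(dx^m) \\
&\quad + 2 |\Lambda_n|^{-1} \sum_{x^m \in \varphi_{\Lambda_n}} \max_{\theta \in \Theta} \left| h_k(x^m, \varphi \setminus x^m; \theta) \right| \\
&\leq \gamma_1,
\end{aligned}
\]

where

\[
\gamma_1 := 4 \times \max_{k=1,\ldots,K} \left[ E\left( \max_{\theta \in \Theta} |f_k(0^M, \Phi; \theta)| \right) + E\left( \max_{\theta \in \Theta} |h_k(0^M, \Phi; \theta)|\, e^{-V(0^M|\Phi;\theta^\star)} \right) \right].
\]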

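Therefore, for all $n \geq n_0$,

\[
\left| U_{\Lambda_n}(\varphi; \mathbf{h}, \theta) - U_{\Lambda_n}(\varphi; \mathbf{h}, \theta') \right|
\leq \gamma_1 \times |\Lambda_n|^{-1} \sum_{k=1}^{K} \left( A_{\Lambda_n}(\varphi; h_k, \theta, \theta') + B_{\Lambda_n}(\varphi; h_k, \theta, \theta') \right),
\]

where

\[
\begin{aligned}
A_{\Lambda_n}(\varphi; h_k, \theta, \theta') &:= \int_{\Lambda_n \times \mathbb{M}} \left| f_k(x^m, \varphi; \theta) - f_k(x^m, \varphi; \theta') \right| \mu(dx^m), \\
B_{\Lambda_n}(\varphi; h_k, \theta, \theta') &:= \sum_{x^m \in \varphi_{\Lambda_n}} \left| h_k(x^m, \varphi \setminus x^m; \theta) - h_k(x^m, \varphi \setminus x^m; \theta') \right|.
\end{aligned}
\]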

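Under Assumption [C4], one may apply the mean value theorem in $\mathbb{R}^p$ as follows: there exist $\xi^{(1)}, \ldots, \xi^{(p)} \in [\min(\theta_1, \theta'_1), \max(\theta_1, \theta'_1)] \times \ldots \times [\min(\theta_p, \theta'_p), \max(\theta_p, \theta'_p)]$ such that for all $\varphi \in \Omega$

\[
\begin{aligned}
A_{\Lambda_n}(\varphi; h_k, \theta, \theta')
&= \int_{\Lambda_n \times \mathbb{M}} \left| \sum_{j=1}^{p} (\theta_j - \theta'_j) \left( \mathbf{f}_k^{(1)}(x^m, \varphi; \xi^{(j)}) \right)_j \right| \mu(dx^m) \\
&\leq p \times \|\theta - \theta'\| \int_{\Lambda_n \times \mathbb{M}} \max_{\theta \in \Theta} \left\| \mathbf{f}_k^{(1)}(x^m, \varphi; \theta) \right\| \mu(dx^m).
\end{aligned}
\]

In a similar way, one may prove that, for $P_{\theta^\star}$-a.e. $\varphi$,

\[
B_{\Lambda_n}(\varphi; h_k, \theta, \theta') \leq p \times \|\theta - \theta'\| \sum_{x^m \in \varphi_{\Lambda_n}} \max_{\theta \in \Theta} \left\| \mathbf{h}_k^{(1)}(x^m, \varphi \setminus x^m; \theta) \right\|.
\]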

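Under Assumption [C4], there exists $n_1(k) \geq 1$ such that for all $n \geq n_1(k)$ we have, for $P_{\theta^\star}$-a.e. $\varphi$,

\[
\frac{1}{|\Lambda_n|} \left( A_{\Lambda_n}(\varphi; h_k, \theta, \theta') + B_{\Lambda_n}(\varphi; h_k, \theta, \theta') \right) \leq \gamma_2 \|\theta - \theta'\|,
\]

where

\[
\gamma_2 := 2p \times \max_{k=1,\ldots,K} \left[ E\left( \max_{\theta \in \Theta} \left\| \mathbf{f}_k^{(1)}(0^M, \Phi; \theta) \right\| \right) + E\left( \max_{\theta \in \Theta} \left\| \mathbf{h}_k^{(1)}(0^M, \Phi; \theta) \right\| e^{-V(0^M|\Phi;\theta^\star)} \right) \right].
\]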

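We finally obtain the following upper bound for $P_{\theta^\star}$-a.e. $\varphi$, for all $\theta, \theta'$ such that $\|\theta - \theta'\| < 1/\ell$ and for all $n \geq N = \max(n_0, \max_k n_1(k))$:

\[
\left| U_{\Lambda_n}(\varphi; \mathbf{h}, \theta) - U_{\Lambda_n}(\varphi; \mathbf{h}, \theta') \right| \leq \gamma \|\theta - \theta'\| < \frac{\gamma}{\ell},
\]

with $\gamma = K \times \gamma_1 \times \gamma_2$, and therefore $W_n(\varphi, 1/\ell) \leq \gamma \times \frac{1}{\ell}$. Finally, since, for any sequence $\varepsilon_\ell > \gamma/\ell$ tending to zero (for instance $\varepsilon_\ell = 2\gamma/\ell$),

\[
\limsup_{n \to +\infty} \left\{ W_n\left(\varphi, \tfrac{1}{\ell}\right) \geq \varepsilon_\ell \right\}
= \bigcap_{m \in \mathbb{N}} \bigcup_{n \geq m} \left\{ W_n\left(\varphi, \tfrac{1}{\ell}\right) \geq \varepsilon_\ell \right\}
\subset \bigcup_{n \geq N} \left\{ W_n\left(\varphi, \tfrac{1}{\ell}\right) \geq \varepsilon_\ell \right\} = \emptyset
\]

for $P_{\theta^\star}$-a.e. $\varphi$, the expected result (34) is proved.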

Conclusion step. Steps 1 and 2 allow us to apply Theorem 3.4.3 of [23], which asserts the almost sure convergence of minimum contrast estimators.

8.2 Proof of Theorem 4

The proof is based on a classical result concerning asymptotic normality for minimum contrast estimators, e.g. Theorem 3.4.5 of [23]. We split it into two different steps.

Step 1. Asymptotic normality of $\mathbf{U}^{(1)}_{\Lambda_n}(\Phi; \mathbf{h}, \theta^\star)$.

We start with a lemma which states a central limit theorem. Its proof uses a conditional centering assumption as in [28], which holds due to the particular form of the contrast function (6). This scheme of proof is now well known and we mainly refer to previous works for the technical details.

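Lemma 10 Under the Assumptions [N1], [N2] and [N3], the following convergence holds in distribution, as $n \to +\infty$,

\[
|\Lambda_n|^{-1/2}\, \mathbf{C}_{\Lambda_n}(\Phi; \mathbf{h}, \theta^\star) \xrightarrow{d} \mathcal{N}\left(0, \Sigma(\mathbf{h}, \theta^\star)\right), \tag{36}
\]

where $\Sigma(\mathbf{h}, \theta^\star)$ and $\mathbf{C}_{\Lambda_n}(\varphi; \mathbf{h}, \theta^\star)$ are defined in Theorem 4.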



Proof. The vector $\mathbf{C}_{\Lambda_n}(\varphi; \mathbf{h}, \theta^\star)$ (of length $K$) corresponds to the vector of the $h_k$-residuals for $k = 1, \ldots, K$ computed on the same domain $\Lambda_n$ with $\hat\theta = \theta^\star$; see [5] for a definition and practical study of this concept of residuals. The asymptotic behavior of the residuals process has been investigated in [12]. In particular, with the notation of the present paper, the asymptotic normality of the vector $\mathbf{C}_{\Lambda_n}(\Phi; \mathbf{h}, \hat\theta)$ for general $\hat\theta$ is obtained (see Proposition 4 in [12]). When $\hat\theta = \theta^\star$, the assumptions and the asymptotic covariance matrix of Proposition 4 in [12] respectively reduce to [N1-3] and (15). ■

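Now, according to the definition of $U_{\Lambda_n}(\varphi; \mathbf{h}, \theta^\star)$, we have

\[
\mathbf{U}^{(1)}_{\Lambda_n}(\varphi; \mathbf{h}, \theta^\star) = 2 |\Lambda_n|^{-2} \sum_{k=1}^{K} \mathbf{C}^{(1)}_{\Lambda_n}(\varphi; h_k, \theta^\star)\, C_{\Lambda_n}(\varphi; h_k, \theta^\star).
\]

Under Assumption [C4], one may apply the ergodic Lemma 2 in order to derive, $P_{\theta^\star}$-a.s., as $n \to +\infty$,

\[
|\Lambda_n|^{-1}\, \mathbf{C}^{(1)}_{\Lambda_n}(\Phi; h_k, \theta^\star) \to E\left( \mathbf{f}_k^{(1)}(0^M, \Phi; \theta^\star) - \mathbf{h}_k^{(1)}(0^M, \Phi; \theta^\star)\, e^{-V(0^M|\Phi;\theta^\star)} \right). \tag{37}
\]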



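It is easily checked that this expectation reduces to $-\mathcal{E}(h_k, \theta^\star)$, this vector of length $p$ being defined by

\[
\mathcal{E}(h_k, \theta^\star) := E\left( h_k(0^M, \Phi; \theta^\star)\, \mathbf{V}^{(1)}(0^M|\Phi; \theta^\star)\, e^{-V(0^M|\Phi;\theta^\star)} \right).
\]

Let us denote by $\mathcal{E}(\mathbf{h}, \theta^\star)$ the $(p, K)$ matrix $(\mathcal{E}(h_1, \theta^\star), \ldots, \mathcal{E}(h_K, \theta^\star))$; then we get the following decomposition

\[
\begin{aligned}
|\Lambda_n|^{1/2}\, \mathbf{U}^{(1)}_{\Lambda_n}(\Phi; \mathbf{h}, \theta^\star)
&= 2 |\Lambda_n|^{1/2} \times |\Lambda_n|^{-2} \sum_{k=1}^{K} \mathbf{C}^{(1)}_{\Lambda_n}(\Phi; h_k, \theta^\star)\, C_{\Lambda_n}(\Phi; h_k, \theta^\star) \\
&= -2 |\Lambda_n|^{-1/2}\, \mathcal{E}(\mathbf{h}, \theta^\star)\, \mathbf{C}_{\Lambda_n}(\Phi; \mathbf{h}, \theta^\star) \\
&\quad + 2 \sum_{k=1}^{K} \left[ |\Lambda_n|^{-1}\, \mathbf{C}^{(1)}_{\Lambda_n}(\Phi; h_k, \theta^\star) - \left( -\mathcal{E}(h_k, \theta^\star) \right) \right] |\Lambda_n|^{-1/2}\, C_{\Lambda_n}(\Phi; h_k, \theta^\star).
\end{aligned}
\]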
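The identification of the limit in (37) with $-\mathcal{E}(h_k, \theta^\star)$ can be spelled out as follows, assuming (as the notation of this section suggests) that $f_k(x^m, \varphi; \theta) = h_k(x^m, \varphi; \theta)\, e^{-V(x^m|\varphi;\theta)}$ and that the superscript $(1)$ denotes the gradient with respect to $\theta$. By the product rule,

\[
\mathbf{f}_k^{(1)} = \mathbf{h}_k^{(1)}\, e^{-V} - h_k\, \mathbf{V}^{(1)}\, e^{-V}
\quad \Longrightarrow \quad
\mathbf{f}_k^{(1)} - \mathbf{h}_k^{(1)}\, e^{-V} = - h_k\, \mathbf{V}^{(1)}\, e^{-V},
\]

and taking the expectation at $(0^M, \Phi; \theta^\star)$ gives $-\mathcal{E}(h_k, \theta^\star)$.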

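According to (37) and Lemma 10, Slutsky's Theorem implies that for any $k \in \{1, \ldots, K\}$,

\[
\left( |\Lambda_n|^{-1}\, \mathbf{C}^{(1)}_{\Lambda_n}(\Phi; h_k, \theta^\star) - \left( -\mathcal{E}(h_k, \theta^\star) \right) \right) |\Lambda_n|^{-1/2}\, C_{\Lambda_n}(\Phi; h_k, \theta^\star) \xrightarrow{d} 0,
\]

as $n \to +\infty$, the zero here being a vector of length $p$. Using again Lemma 10, we finally reach the following convergence in distribution as $n \to +\infty$

\[
|\Lambda_n|^{1/2}\, \mathbf{U}^{(1)}_{\Lambda_n}(\Phi; \mathbf{h}, \theta^\star) \xrightarrow{d} \mathcal{N}\left( 0,\ 4\, \mathcal{E}(\mathbf{h}, \theta^\star)\, \Sigma(\mathbf{h}, \theta^\star)\, \mathcal{E}(\mathbf{h}, \theta^\star)^T \right),
\]

where $\Sigma(\mathbf{h}, \theta^\star)$ is defined by (15).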

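Step 2. Convergence of $\mathbf{U}^{(2)}_{\Lambda_n}(\Phi; \mathbf{h}, \theta)$ for $\theta \in \mathcal{V}(\theta^\star)$

According to our definition and Assumption [N4], the $(p, p)$ matrix $\mathbf{U}^{(2)}_{\Lambda_n}(\varphi; \mathbf{h}, \theta)$ is defined for $i, j = 1, \ldots, p$ by

\[
\left( \mathbf{U}^{(2)}_{\Lambda_n}(\varphi; \mathbf{h}, \theta) \right)_{ij} = 2 |\Lambda_n|^{-2} \sum_{k=1}^{K} \left[ \left( \mathbf{C}^{(2)}_{\Lambda_n}(\varphi; h_k, \theta) \right)_{ij} C_{\Lambda_n}(\varphi; h_k, \theta) + \left( \mathbf{C}^{(1)}_{\Lambda_n}(\varphi; h_k, \theta) \right)_i \left( \mathbf{C}^{(1)}_{\Lambda_n}(\varphi; h_k, \theta) \right)_j \right].
\]

Note also that $\mathbf{C}^{(1)}_{\Lambda_n}(\varphi; h_k, \theta)$ and $\mathbf{C}^{(2)}_{\Lambda_n}(\varphi; h_k, \theta)$ are defined by

\[
\begin{aligned}
\mathbf{C}^{(1)}_{\Lambda_n}(\varphi; h_k, \theta) &= \int_{\Lambda_n \times \mathbb{M}} \mathbf{f}_k^{(1)}(x^m, \varphi; \theta)\, \mu(dx^m) - \sum_{x^m \in \varphi_{\Lambda_n}} \mathbf{h}_k^{(1)}(x^m, \varphi \setminus x^m; \theta), \\
\mathbf{C}^{(2)}_{\Lambda_n}(\varphi; h_k, \theta) &= \int_{\Lambda_n \times \mathbb{M}} \mathbf{f}_k^{(2)}(x^m, \varphi; \theta)\, \mu(dx^m) - \sum_{x^m \in \varphi_{\Lambda_n}} \mathbf{h}_k^{(2)}(x^m, \varphi \setminus x^m; \theta).
\end{aligned}
\]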



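Under Assumption [N4], for all $i, j = 1, \ldots, p$ and for any $k = 1, \ldots, K$, each normalized term $|\Lambda_n|^{-1} C_{\Lambda_n}(\Phi; h_k, \theta)$, $|\Lambda_n|^{-1} \mathbf{C}^{(1)}_{\Lambda_n}(\Phi; h_k, \theta)$ and $|\Lambda_n|^{-1} \mathbf{C}^{(2)}_{\Lambda_n}(\Phi; h_k, \theta)$ satisfies the assumptions of the ergodic Lemma 2. Therefore, for any $\theta \in \mathcal{V}(\theta^\star)$, there exists a matrix $\mathbf{U}^{(2)}(\mathbf{h}, \theta)$ such that, $P_{\theta^\star}$-a.s.,

\[
\mathbf{U}^{(2)}_{\Lambda_n}(\Phi; \mathbf{h}, \theta) \to \mathbf{U}^{(2)}(\mathbf{h}, \theta).
\]

This justifies that, for $n$ large enough, in a neighborhood of $\theta^\star$, $\left( \mathbf{U}^{(2)}_{\Lambda_n}(\varphi; \mathbf{h}, \theta) \right)_{ij}$ is uniformly bounded by $2 \times \max_{\theta \in \mathcal{V}(\theta^\star)} \left| \left( \mathbf{U}^{(2)}(\mathbf{h}, \theta) \right)_{ij} \right|$ for $P_{\theta^\star}$-a.e. $\varphi$. When $\theta = \theta^\star$, recall that $|\Lambda_n|^{-1} C_{\Lambda_n}(\Phi; h_k, \theta^\star)$ converges almost surely to zero and that (37) holds. Hence, $\mathbf{U}^{(2)}(\mathbf{h}, \theta^\star)$ reduces to $2\, \mathcal{E}(\mathbf{h}, \theta^\star)\, \mathcal{E}(\mathbf{h}, \theta^\star)^T$.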

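Conclusion Step. From Theorem 3.4.5 of [23], Steps 1 and 2 ensure that the normalized difference $|\Lambda_n|^{1/2} \left( \mathbf{U}^{(2)}(\mathbf{h}, \theta^\star) \left( \hat\theta_n(\Phi) - \theta^\star \right) - \mathbf{U}^{(1)}_{\Lambda_n}(\Phi; \mathbf{h}, \theta^\star) \right)$ converges in probability to 0, which is the expected result.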
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